Hello,
I have a problem extracting specific phrases from a long list of rows. Here is an example of 6 rows from my long document:
We sell big and small blue widgets at http://www.bluewidgetsdomain.com/
Our website is http://www.bluewidgetsdomain.com/
We sell many kinds of widgets. Go to this site for green widgets at http://www.green-widgets-domain.net/
Our website is http://www.green-widgets-domain.net/
We sell widgets. Check out red widgets at http://www.red-widgets-domain.org/
Our website is http://www.red-widgets-domain.org/
Qn 1) How can I extract the words bluewidgetsdomain, green-widgets-domain, red-widgets-domain from each row and delete the rest of the words?
Qn 2) For the rows that have the phrase [widgets at], I want to extract all the words before [widgets at]. I would also like to know how to extract all the words after [widgets at].
Qn 3) I want to extract only the domains ending with .com. (In this example, http://www.bluewidgetsdomain.com/ will be extracted.)
Qn 4) I want to extract the words between [We sell] and [at]. (For example, for row 1 the extracted words will be [big and small blue widgets], for row 3 the extracted words will be [many kinds of widgets. Go to this site for green widgets], and for row 5 the extracted words will be [widgets. Check out red widgets].)
Qn 5) If the domain has dashes, I want to remove the dashes. (For example, http://www.green-widgets-domain.net/ will become http://www.greenwidgetsdomain.net/.)
Qn 6) I want to remove the trailing slash at the end of each domain. (For example, http://www.green-widgets-domain.net/ will become http://www.green-widgets-domain.net.)
Qn 7) I want to delete all rows that start with [Our website]
I appreciate any help. Thanks in advance!
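For reference, here is a minimal Python sketch of one way the seven tasks above might be approached with regular expressions. It assumes each row is a single line of text and that every URL has the form http://www.NAME.TLD/ as in the sample rows. The function names (domain_name, split_on_phrase, and so on) are hypothetical and chosen only for illustration; a text editor with regex search and replace could do much the same.

```python
import re

# Sample rows from the question (assumed to be one row per line).
rows = [
    "We sell big and small blue widgets at http://www.bluewidgetsdomain.com/",
    "Our website is http://www.bluewidgetsdomain.com/",
    "We sell many kinds of widgets. Go to this site for green widgets at http://www.green-widgets-domain.net/",
    "Our website is http://www.green-widgets-domain.net/",
    "We sell widgets. Check out red widgets at http://www.red-widgets-domain.org/",
    "Our website is http://www.red-widgets-domain.org/",
]

# Assumption: a URL is "http://" or "https://" followed by non-space characters.
URL = re.compile(r"https?://\S+")

def domain_name(row):
    """Qn 1: the name between 'www.' and the next dot, or None if absent."""
    m = re.search(r"https?://www\.([^./\s]+)\.", row)
    return m.group(1) if m else None

def split_on_phrase(row, phrase="widgets at"):
    """Qn 2: (text before, text after) the first occurrence of the phrase."""
    before, sep, after = row.partition(phrase)
    return (before.strip(), after.strip()) if sep else None

def com_urls(all_rows):
    """Qn 3: every URL whose host ends in .com (duplicates are kept)."""
    urls = (m.group(0) for row in all_rows for m in URL.finditer(row))
    return [u for u in urls if re.search(r"\.com/?$", u)]

def between_sell_and_at(row):
    """Qn 4: text between 'We sell' and the last standalone 'at'."""
    m = re.search(r"We sell\s+(.*)\s+at\b", row)
    return m.group(1) if m else None

def strip_dashes(row):
    """Qn 5: remove dashes inside URLs only, leaving other text alone."""
    return URL.sub(lambda m: m.group(0).replace("-", ""), row)

def strip_trailing_slash(row):
    """Qn 6: remove the slash(es) at the end of each URL."""
    return URL.sub(lambda m: m.group(0).rstrip("/"), row)

def drop_our_website(all_rows):
    """Qn 7: keep only rows that do not start with 'Our website'."""
    return [r for r in all_rows if not r.startswith("Our website")]

if __name__ == "__main__":
    for row in drop_our_website(rows):
        print(domain_name(row), "|", between_sell_and_at(row))
```

Each function handles one question, so they can be combined in any order, for example dropping the [Our website] rows first and then stripping dashes and trailing slashes from what remains.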