Hello,
I have a problem extracting specific phrases from a long list of rows. Here is an example of 6 rows from my long document:
We sell big and small blue widgets at http/www.bluewidgetsdomain.com/
Our website is http/www.bluewidgetsdomain.com/
We sell many kinds of widgets. Go to this site for green widgets at http/www.green-widgets-domain.net/
Our website is http/www.green-widgets-domain.net/
We sell widgets. Check out red widgets at http/www.red-widgets-domain.org/
Our website is http/www.red-widgets-domain.org/
Qn 1) How can I extract the domain names (bluewidgetsdomain, green-widgets-domain, red-widgets-domain) from each row and delete the rest of the words?
Qn 2) For the rows that contain the phrase [widgets at], I want to extract all the words before [widgets at]. I would also like to know how to extract all the words after [widgets at].
Qn 3) I want to extract only the domains ending in .com. (For example, in the rows above http/www.bluewidgetsdomain.com/ would be extracted.)
Qn 4) I want to extract the words between [We sell] and [at]. (For example, for row 1 the extracted words will be [big and small blue widgets], for row 3 [many kinds of widgets. Go to this site for green widgets], and for row 5 [widgets. Check out red widgets].)
Qn 5) If the domain has dashes, I want to remove them. (For example, http/www.green-widgets-domain.net/ will become http/www.greenwidgetsdomain.net/)
Qn 6) I want to remove the trailing slash at the end of each domain. (For example, http/www.green-widgets-domain.net/ will become http/www.green-widgets-domain.net)
Qn 7) I want to delete all rows that start with [Our website].
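To make the expected results concrete, here is a rough Python sketch of the seven operations using only the standard re module. It assumes every URL has the shape http/www.<name>.<tld>/ exactly as in the sample rows, and the function names are made up for illustration. It is only a starting point under those assumptions, not necessarily the best way to do it.

```python
import re

rows = [
    "We sell big and small blue widgets at http/www.bluewidgetsdomain.com/",
    "Our website is http/www.bluewidgetsdomain.com/",
    "We sell many kinds of widgets. Go to this site for green widgets at http/www.green-widgets-domain.net/",
    "Our website is http/www.green-widgets-domain.net/",
    "We sell widgets. Check out red widgets at http/www.red-widgets-domain.org/",
    "Our website is http/www.red-widgets-domain.org/",
]

# Assumed URL shape, taken from the sample rows: http/www.<name>.<tld>/
# Group 1 is the domain name, group 2 is the ending (com, net, org, ...).
URL = re.compile(r"http/www\.([\w-]+)\.(\w+)/?")


def domain_name(row):
    """Qn 1: return only the domain name, or None if the row has no URL."""
    m = URL.search(row)
    return m.group(1) if m else None


def split_on_widgets_at(row):
    """Qn 2: return (text before, text after) [widgets at], or None."""
    before, sep, after = row.partition("widgets at")
    return (before.strip(), after.strip()) if sep else None


def com_urls(row):
    """Qn 3: return the URLs in the row whose ending is .com."""
    return [m.group(0) for m in URL.finditer(row) if m.group(2) == "com"]


def between_we_sell_and_at(row):
    """Qn 4: return the words between [We sell] and the last [at]."""
    # Greedy (.*) so that the final " at " in the row is the one used.
    m = re.search(r"We sell (.*) at ", row)
    return m.group(1) if m else None


def strip_dashes_in_urls(row):
    """Qn 5: remove dashes inside URLs only, leaving other text alone."""
    return URL.sub(lambda m: m.group(0).replace("-", ""), row)


def strip_trailing_slash(row):
    """Qn 6: remove the slash at the end of each URL."""
    return URL.sub(lambda m: m.group(0).rstrip("/"), row)


def drop_our_website(all_rows):
    """Qn 7: keep only the rows that do not start with [Our website]."""
    return [r for r in all_rows if not r.startswith("Our website")]
```

For example, `[domain_name(r) for r in drop_our_website(rows)]` would combine Qn 7 and Qn 1 and give one domain name per remaining row.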
I appreciate any help. Thanks in advance!