I have a large text file that I am trying to convert to a CSV. Right now there are no line breaks, but each record that I want on its own line ends with text matching the regular expression url is \S+.
I am using BBEdit to find and hopefully extract the lines. I originally tried inserting a line break after each match, but if I put url is \S+\n in the replace field, the pattern is taken literally and my URL is gone. Some expressions I've tried:
\burl is \S+
\b.*url is \S+
$url is \S+
.*$url is \S+
url is \S+ $
url is \S+\$
The syntax of each line is
<message>, post has <#> likes, profile is <name>, url is <characters>
So an example of the document is:
message 1 here, post has 37 likes, profile is name1, url is 8gjEobL1U4 message 2, some messages have commas in them, post has 182 likes, profile is name2, url is 89PI4JOscv here is another message, post has 105 likes, profile is someoneelse, url is 89baAOzDLj
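With GNU grep (-o prints each match on its own line; -P enables Perl-compatible regexes, which the lazy .*? needs):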
grep -oP '.*? url is [^ ]+ *' file
Output:
message 1 here, post has 37 likes, profile is name1, url is 8gjEobL1U4
message 2, some messages have commas in them, post has 182 likes, profile is name2, url is 89PI4JOscv
here is another message, post has 105 likes, profile is someoneelse, url is 89baAOzDLj
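If the end goal is a real CSV rather than just one record per line, here is a rough Python sketch of the same idea. It rests on assumptions beyond the question: profile names and URLs contain no whitespace, and no message itself contains the text ", post has <#> likes,". The names split_records, to_rows and to_csv are made up for illustration. The first function also shows why the original replace attempt lost the URL: the replacement side is not a pattern, so the match has to be captured in a group and referenced as \1.

```python
import csv
import io
import re

# Sample text copied from the question: three records on a single line.
TEXT = (
    "message 1 here, post has 37 likes, profile is name1, url is 8gjEobL1U4 "
    "message 2, some messages have commas in them, post has 182 likes, "
    "profile is name2, url is 89PI4JOscv "
    "here is another message, post has 105 likes, profile is someoneelse, "
    "url is 89baAOzDLj"
)

# One record, following the syntax given in the question. The message is
# matched lazily so it stops at the first ", post has <#> likes," it meets.
# Assumption: profile and url are single whitespace-free tokens.
RECORD = re.compile(
    r"\s*(?P<message>.*?), post has (?P<likes>\d+) likes, "
    r"profile is (?P<profile>\S+), url is (?P<url>\S+)"
)


def split_records(text):
    """Put a newline after every 'url is <token>', reusing the match via \\1."""
    return re.sub(r"(url is \S+) *", r"\1\n", text)


def to_rows(text):
    """Return (message, likes, profile, url) tuples for every record found."""
    return [
        (m["message"], int(m["likes"]), m["profile"], m["url"])
        for m in RECORD.finditer(text)
    ]


def to_csv(rows):
    """Serialize rows as CSV text; the csv module quotes messages with commas."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["message", "likes", "profile", "url"])
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    print(split_records(TEXT), end="")
    print(to_csv(to_rows(TEXT)), end="")
```

Splitting on commas alone would not work here because messages can contain commas, which is why the sketch anchors on the fixed phrases instead and leaves the quoting to the csv module.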