regex sed nlp text-processing

Replace spaces with new lines if part of a specific pattern using sed and regex with extended syntax


I have a text file with multiple instances of text that look like this:

word.  word or words [something:'else]

I need to replace the double space after a period with a newline, but only when it is followed by a sequence of words and then a "[", like so:

word.\nword or words [something:'else]

I thought about using the sed command in bash with extended regex syntax, but nothing has worked so far... I've tried different variations of this:

sed -E 's/(\.)( )(.*)(.\[)/\1\n\3\4/g' old.txt > new.txt

I'm an absolute beginner at this, so I'm not sure at all about what I'm doing 😳


Solution

  • This might work for you (GNU sed):

    sed -E 's/\.  ((\w+ )+\[)/\.\n\1/g' file
    

    This globally replaces a period followed by two spaces, one or more space-separated words, and an opening square bracket with a period, a newline, and the text captured by back-reference \1 (the words and the bracket).
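
To see the effect, here is a small sketch that pipes made-up sample text through the same command instead of reading a file. The sample lines are assumptions for illustration only, and the command relies on GNU sed for `\w` and for `\n` in the replacement.

```shell
# Hypothetical sample input. The first line matches the pattern
# (period, two spaces, words, "["); the second has no "[" and should stay as it is.
input=$(printf '%s\n' \
  'first.  second or third [tag:a]' \
  'no bracket.  plain words')

# Same substitution as above, reading from stdin instead of a file.
result=$(printf '%s\n' "$input" | sed -E 's/\.  ((\w+ )+\[)/\.\n\1/g')

printf '%s\n' "$result"
```

Only the double space that is followed by words and a "[" should turn into a line break; the second sample line has no bracket, so it is left alone.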