
Regex to unwrap paragraphs: Remove returns and new lines at end of lines that have content, but not empty lines


I'm using Text Soap by Unmarked Software on macOS. Its regex find-and-replace tool uses ICU Regular Expression syntax, which is pretty much PCRE. I'm still new to regex and still learning its many intricacies, so please be patient with me.

I'm struggling to match the newline or carriage return at the end of a line that has content, without matching the newline of an empty line or a newline that is immediately followed by an empty line.

I've tried positive lookbehind and positive lookahead in multiline mode, but haven't been able to figure it out. With a bit of trial and error, I did work out that $ is after the newline/carriage return.

I am essentially trying to unwrap paragraphs but maintain them as paragraphs.

I want input such as this example:

"I need to unblock," someone may have breathed out.\n
\n
"I know how to do it," I may have responded, picking up\n
the cue. My life has always included strong internal directives.\n
Marching orders) I call them.\n
\n
In any case, I suddenly knew that I did know how to un-\n
block people and that I was meant to do so, starting then and\n
there with the lessons I myself had learned.\n
\n
Where did the lessons come from?\n
\n
In 1978, in January, I stopped drinking. I had never\n
thought drinking made me a writer, but now I suddenly\n
thought not drinking might make me stop. In my mind,\n
drinking and writing went together like, well, scotch and\n
soda. For me, the trick was always getting past the fear and\n
onto the page. I was playing beat the clock-trying to write be-\n
fore the booze closed in like fog and my window of creativity\n
was blocked again.\n

To output this:

"I need to unblock," someone may have breathed out.\n
\n
"I know how to do it," I may have responded, picking up the cue. My life has always included strong internal directives. Marching orders) I call them.\n
\n
In any case, I suddenly knew that I did know how to un-block people and that I was meant to do so, starting then and there with the lessons I myself had learned.\n
\n
Where did the lessons come from?\n
\n
In 1978, in January, I stopped drinking. I had never thought drinking made me a writer, but now I suddenly thought not drinking might make me stop. In my mind, drinking and writing went together like, well, scotch and soda. For me, the trick was always getting past the fear and onto the page. I was playing beat the clock-trying to write be-fore the booze closed in like fog and my window of creativity was blocked again.\n

Solution

  • If I understand correctly, you can use this regex:

    (?<!\n)\n(?!\n)
    

    Replace each match with an empty string.

    If your line break is something other than \n, substitute that character or string for every \n in the pattern. For example, if your line break is \r\n, use:

    (?<!\r\n)\r\n(?!\r\n)
    

    Essentially, the regex matches a newline that is neither preceded nor followed by another newline, and replacing the match with an empty string removes it.
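
To try the pattern outside Text Soap, here is a minimal sketch using Python's `re` module. Python is an assumption here (it is not the ICU engine the question uses), and the names `unwrap`, `unwrap_crlf`, and `joiner` are hypothetical helpers made up for illustration. The sample text assumes, as the expected output in the question suggests, that wrapped lines end with a trailing space unless they end with a hyphen.

```python
import re

# A newline that is neither preceded nor followed by another newline.
SINGLE_LF = re.compile(r"(?<!\n)\n(?!\n)")
# The same idea for Windows-style \r\n line breaks.
SINGLE_CRLF = re.compile(r"(?<!\r\n)\r\n(?!\r\n)")


def unwrap(text: str, joiner: str = "") -> str:
    """Remove single \\n line breaks; keep blank lines between paragraphs."""
    return SINGLE_LF.sub(joiner, text)


def unwrap_crlf(text: str, joiner: str = "") -> str:
    """Same as unwrap, but for text that uses \\r\\n line breaks."""
    return SINGLE_CRLF.sub(joiner, text)


sample = (
    '"I know how to do it," I may have responded, picking up \n'
    "the cue.\n"
    "\n"
    "I did know how to un-\n"
    "block people."
)

# Each paragraph is joined onto one line; the blank line between them stays.
print(unwrap(sample))
```

If the wrapped lines have no trailing space, passing `joiner=" "` joins them with a single space instead, although hyphenated breaks such as "un-" would then also get a space. Note that a lone newline at the very end of the text also matches the pattern and would be removed.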