regex pcre regex-lookarounds regex-negation

Regexp - joining multiple lines not starting with dash


I have lines that look like this:

- test 1
  test test test
  test test test
  test test test
- test2
- test3
  test test t
  test test test
- test 4
  test test test
- test5

I am looking for a regexp to convert them into this:

- test 1
  test test test test test test test test test
- test2
- test3
  test test t test test test
- test 4
  test test test
- test5

That is, remove the newline after every line that does not begin with \s*?\- and is not immediately followed by a line that begins with \s*?\-


Solution

  • How about something like

    ^(\h*[^-\s].*)\R(?!-)
    

    and replace with $1

    • ^ matches the start of a line (this needs the multiline flag m)
    • (\h*[^-\s].*) is the first capture group: any amount of horizontal whitespace (\h), followed by one character that is neither - nor whitespace (\s), followed by the rest of the line
    • \R(?!-) matches a newline sequence that is not followed by a hyphen

    See this demo at regex101
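
As a rough illustration, the same substitution can be sketched in Python. Python's built-in re module does not support \h or \R, so this sketch uses [^\S\r\n] in place of \h, [^\r\n]* in place of .* and \r?\n in place of \R. Those stand-ins and the name join_continuation_lines are assumptions made for the example, not part of the answer.

```python
import re

# Approximate translation of ^(\h*[^-\s].*)\R(?!-) for Python's re module:
#   [^\S\r\n]  stands in for \h (whitespace other than CR/LF)
#   [^\r\n]*   stands in for .* (the rest of the line)
#   \r?\n      stands in for \R (covers LF and CRLF only)
# re.MULTILINE lets ^ match at the start of every line.
PATTERN = re.compile(r"^([^\S\r\n]*[^-\s][^\r\n]*)\r?\n(?!-)", re.MULTILINE)


def join_continuation_lines(text: str) -> str:
    """Drop the newline after each non-dash line unless the next line starts with a hyphen."""
    return PATTERN.sub(r"\1", text)


SAMPLE = (
    "- test 1\n"
    "  test test test\n"
    "  test test test\n"
    "  test test test\n"
    "- test2\n"
    "- test3\n"
    "  test test t\n"
    "  test test test\n"
    "- test 4\n"
    "  test test test\n"
    "- test5"
)

print(join_continuation_lines(SAMPLE))
```

Only the newline is removed, so each joined part keeps its own indentation: the first block comes out as "  test test test  test test test  test test test", with two spaces between the parts.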

    To join the parts with just one space, see this version and replace with $1 (a bit less efficient).
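
The linked single-space pattern is not reproduced here. As one possible alternative, and only a sketch under my own assumptions, the Python code below matches a whole run of consecutive non-dash lines and joins them with a replacement callback. The names RUN and join_with_single_space are hypothetical. Unlike the pattern above, a run here ends at any blank line or at any line whose first non-blank character is a hyphen.

```python
import re

# Hypothetical pattern: two or more consecutive lines whose first
# non-blank character is not a hyphen (blank lines end the run).
RUN = re.compile(
    r"^[^\S\r\n]*[^-\s][^\r\n]*(?:\r?\n[^\S\r\n]*[^-\s][^\r\n]*)+",
    re.MULTILINE,
)


def join_with_single_space(text: str) -> str:
    """Merge each run of non-dash lines into one line, separated by single spaces."""

    def _join(match: re.Match) -> str:
        first, *rest = match.group(0).splitlines()
        # Keep the indentation of the first line, strip it from the others.
        return " ".join([first.rstrip(), *(line.strip() for line in rest)])

    return RUN.sub(_join, text)


print(join_with_single_space("- test3\n  test test t\n  test test test\n- test 4"))
```

This trades the single regex replacement for a callback, which keeps the pattern simple at the cost of being Python-specific.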