php regex string preg-match

Insert at most one leading and trailing newline after each word


I have strings that contain delimiters. Each delimiter should be preceded and followed by a newline, but in my data the newlines are often missing. Three examples:

[heading 1]content[heading 2]content[heading 3]content

[heading 1]content↵
[heading 2]content↵
[heading 3]content

[heading 1]↵
content[heading 2]↵
content[heading 3]↵
content

I need to normalize the data using a regular-expression find-replace. Each delimiter must have exactly one leading and one trailing newline, except the first delimiter, which must have a trailing newline only:

[heading 1]↵
content↵
[heading 2]↵
content↵
[heading 3]↵
content

I have tried this find-replace pattern (regexr) but it does not work in all cases:

find: \[.+?\](?!\r\n)
repl: $0\r\n

Update: I prefer a single-regex solution that does not require pre- or post-processing such as replace and trim.


Solution

  • This should work for you:

    First remove all existing newlines with str_replace(). Then preg_replace() can add a newline after each [heading] and after each block of content, and trim() drops the final trailing newline, e.g.

    <?php
    
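        $str = "[heading 1]content[heading 2]content[heading 3]content";
        // Remove every existing line break (\r\n, \r or \n), not only the platform's PHP_EOL.
        $str = str_replace(["\r\n", "\r", "\n"], "", $str);
        // Add one newline after each [heading] and after its content, then trim the last one.
        $str = trim(preg_replace("/(\[.+?\])((?:[^\[])+)/", "$1" . PHP_EOL . "$2" . PHP_EOL, $str));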
    
        highlight_string($str);
    
    ?>
    

    output:

    [heading 1]
    content
    [heading 2]
    content
    [heading 3]
    content
    

    EDIT:

    If you only want to use a regex, you could do something like this:

    <?php
    
        $str = "[heading 1]content[heading 2]content[heading 3]content
    
    [heading 1]content
    [heading 2]content
    [heading 3]content
    
    [heading 1]
    content[heading 2]
    content[heading 3]
    content";
    
        $str = preg_replace("/\s*(\[.+?\]|(?![^\[]+$(*PRUNE))[^\[]+)\s*/", "$1" . PHP_EOL, $str);
        var_dump($str);
    
    ?>
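
One thing to watch in the single-regex version above: the greedy `[^\[]+` can also swallow a newline that directly follows the content, so an input like the second example may end up with a blank line before the next heading. The following is a minimal sketch of a variant that keeps trailing whitespace out of the captured content. It assumes headings never contain `]` and content never contains `[`. The function name `normalize_headings` and its `$eol` parameter are hypothetical, introduced only for illustration.

```php
<?php

/**
 * Hypothetical helper (not part of the original answer).
 * Puts each [heading] and each block of content on its own line with one
 * preg_replace() call. The last block of content is left untouched, so no
 * trim() is needed afterwards.
 *
 * Assumptions: headings contain no "]" and content contains no "[".
 */
function normalize_headings(string $text, string $eol = PHP_EOL): string
{
    // Extended mode (x flag) lets the pattern carry its own comments.
    $pattern = '/
        \s*                              # whitespace in front of the token
        (
            \[[^\]]+\]                   # a [heading]
          |
            [^\[\s](?:[^\[]*[^\[\s])?    # content without surrounding whitespace
            (?=\s*\[)                    # only when another heading follows
        )
        \s*                              # whitespace behind the token
    /x';

    // Each matched token is written back followed by exactly one line break.
    // preg_replace() returns null on error, in which case the input is returned as is.
    return preg_replace($pattern, '${1}' . $eol, $text) ?? $text;
}
```

Calling `normalize_headings($str, "\r\n")` on any of the three inputs from the question should give the same normalized layout. Passing the line break as a parameter is a design choice of this sketch, so that the output does not depend on the platform's `PHP_EOL`.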