php regex bbcode

Remove nested quotes


I have this text and I'm trying to remove all the inner quotes, keeping only one quoting level. The text inside a quote can contain any characters, including line feeds. Is this possible with a regex, or do I have to write a little parser?

[quote=foo]I really like the movie. [quote=bar]World 

War Z[/quote] It's amazing![/quote]
This is my comment.
[quote]Hello, World[/quote]
This is another comment.
[quote]Bye Bye Baby[/quote]

Here is the text I want:

[quote=foo]I really like the movie.  It's amazing![/quote]
This is my comment.
[quote]Hello, World[/quote]
This is another comment.
[quote]Bye Bye Baby[/quote]

This is the regex I'm using in PHP:

%\[quote\s*(=[a-zA-Z0-9\-_]*)?\](.*)\[/quote\]%si

I also tried this variant, but it doesn't match "." or ",", and I can't anticipate every character that may appear inside a quote:

%\[quote\s*(=[a-zA-Z0-9\-_]*)?\]([\w\s]+)\[/quote\]%i

The problem is located here:

(.*)
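
To illustrate, here is a minimal sketch on a shortened, hypothetical sample string (same shape as the text above, not the real input). With the s modifier, the greedy (.*) runs on to the last [/quote] in the whole text, while a lazy (.*?) stops at the first [/quote], which belongs to the inner quote. Neither one tracks nesting:

```php
<?php
// Shortened sample input (hypothetical, same shape as the text above).
$text = "[quote=foo]a [quote=bar]b[/quote] c[/quote]\nmine\n[quote]d[/quote]";

// Greedy: (.*) swallows everything up to the LAST [/quote] in the text.
preg_match('%\[quote\s*(=[a-zA-Z0-9\-_]*)?\](.*)\[/quote\]%si', $text, $greedy);

// Lazy: (.*?) stops at the FIRST [/quote], which closes the inner quote.
preg_match('%\[quote\s*(=[a-zA-Z0-9\-_]*)?\](.*?)\[/quote\]%si', $text, $lazy);

// $greedy[2] spans both top-level quotes and the comment between them;
// $lazy[2] is "a [quote=bar]b", cut off in the middle of the outer quote.
var_dump($greedy[2], $lazy[2]);
```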

Solution

  • You can use this:

    $result = preg_replace('~\G(?!\A)(?>(\[quote\b[^]]*](?>[^[]+|\[(?!/?quote)|(?1))*\[/quote])|(?<!\[)(?>[^[]+|\[(?!/?quote))+\K)|\[quote\b[^]]*]\K~', '', $text);
    

    Details:

    \G(?!\A)              # contiguous to the previous match, not at the start of the string
    (?>                   ## content inside "quote" tags at level 0
      (                    ## nested "quote" tags (group 1)
        \[quote\b[^]]*]
        (?>                ## content inside "quote" tags at any level
          [^[]+
         |                  # OR
          \[(?!/?quote)
         |                  # OR
          (?1)              # repeat the capture group 1 (recursive)
        )*
        \[/quote]
      )
     |
      (?<!\[)           # not preceded by an opening square bracket
      (?>              ## content that is not a quote tag
        [^[]+           # all that is not a [
       |                # OR
        \[(?!/?quote)   # a [ not followed by "quote" or "/quote"
      )+\K              # repeat 1 or more times, then reset the match start
    )
    |                   # OR
    \[quote\b[^]]*]\K   # opening "quote" tag at level 0
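
If a single pattern like this is hard to maintain, the "little parser" mentioned in the question is also short. The following is a sketch of that alternative, not the pattern above: it splits the text on quote tags and tracks the nesting depth. The function name stripNestedQuotes is hypothetical, and the sketch assumes tags look like [quote], [quote=name] or [/quote]. Unbalanced tags are passed through without any attempt at repair.

```php
<?php
// Hypothetical helper: keeps level-1 quotes and drops anything nested deeper.
function stripNestedQuotes(string $text): string
{
    // Split on quote tags and keep the tags themselves as tokens.
    // Without PREG_SPLIT_NO_EMPTY the result strictly alternates:
    // text, tag, text, tag, ..., text.
    $tokens = preg_split('~(\[/?quote\b[^]]*])~i', $text, -1, PREG_SPLIT_DELIM_CAPTURE);
    if ($tokens === false) {
        return $text; // regex error: leave the input untouched
    }

    $depth = 0;
    $out = '';
    foreach ($tokens as $i => $token) {
        if ($i % 2 === 0) {
            // Plain text: keep it outside quotes and at quote level 1.
            if ($depth <= 1) {
                $out .= $token;
            }
        } elseif ($token[1] !== '/') {
            // Opening tag: keep it only when it starts a level-1 quote.
            $depth++;
            if ($depth === 1) {
                $out .= $token;
            }
        } else {
            // Closing tag: keep it when it closes a level-1 quote
            // (a stray closing tag at level 0 is also kept as-is).
            if ($depth <= 1) {
                $out .= $token;
            }
            if ($depth > 0) {
                $depth--;
            }
        }
    }
    return $out;
}

echo stripNestedQuotes('[quote=a]x [quote=b]y[/quote] z[/quote] w'), "\n";
// prints: [quote=a]x  z[/quote] w
```

The trade-off is more lines of code in exchange for logic that is easier to step through; the pattern above does the same job in a single preg_replace call.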