regex, preg-match

Matching tags not between [a] and [/a] tags


I have content like this:

[a]
[b]
[a]
[c]
[/a]
[c]
[a]
[b]
[c]
[/a]    
[/a]
[b]
[b]
[b]
[c]

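I want to match only the [b] and [c] tags that are not inside an [a]...[/a] block. The [a]...[/a] blocks can be nested, and preg_match should not select a [b] or [c] at any nesting depth inside them. In the sample above, only the last four tags ([b], [b], [b], [c]) should match.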

How can I do this?


Solution

  • Try: (?&a)|(\[[bc]\])|(*F)(?:(?'a'\[a\](?&nest)\[/a\])|(?'nest'((?&a)|\[[bc]\]|\s)*))

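    (The single-line modifier is harmless but not needed: the pattern contains no ".", so the modifier has no effect, and \s already matches the newlines between tags.)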

    I've rewritten it as a context-free grammar.

    Explanation

    • It's magic

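    • (?&a)|(\[[bc]\]) First try to consume a whole [a]...[/a] block by calling the subpattern "a"; nothing is captured for it. Otherwise, match a single [b] or [c] and capture it in group 1. Because every "a" block is consumed in one piece, group 1 is only ever filled by tags outside those blocks.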

    • (*F)(?:(?'a'\[a\](?&nest)\[/a\])|(?'nest'((?&a)|\[[bc]\]|\s)*))

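      • (*F) forces this alternative to fail immediately, so it never takes part in a match. It is a hackish way to declare the subpatterns "a" and "nest" so they can be called with (?&a) and (?&nest), without the declarations actually being a part of the match.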
    • (?'a'\[a\](?&nest)\[/a\]) Pattern "a" matches an opening [a], then the "nest" pattern, then the closing [/a].

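    • (?'nest'((?&a)|\[[bc]\]|\s)*) The "nest" pattern matches zero or more of the following, in any order: another "a" block, a [b] or [c] tag, or a whitespace character (including newlines). The mutual reference between "a" and "nest" is what lets the pattern handle arbitrarily deep nesting.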
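
For reference, here is a minimal PHP sketch of how the pattern above might be run against the sample from the question. The helper name topLevelTags, the "~" delimiter, and the group name "tag" are my own choices and not part of the original answer. The second pattern is an assumed equivalent, not the answerer's: it declares the same two subpatterns inside a (?(DEFINE)...) group, which PCRE provides for this purpose, instead of hiding them behind (*F).

```php
<?php
// Sample input from the question: nested [a]...[/a] blocks, then top-level tags.
$content = <<<'TXT'
[a]
[b]
[a]
[c]
[/a]
[c]
[a]
[b]
[c]
[/a]    
[/a]
[b]
[b]
[b]
[c]
TXT;

// Pattern from the answer, unchanged. "~" is used as the delimiter because
// the pattern itself contains "/". Group 1 holds the captured [b] or [c].
$hack = <<<'RE'
~(?&a)|(\[[bc]\])|(*F)(?:(?'a'\[a\](?&nest)\[/a\])|(?'nest'((?&a)|\[[bc]\]|\s)*))~
RE;

// Assumed equivalent: the subpatterns are declared in a (?(DEFINE)...) group,
// which is never executed directly, and the capture is named "tag".
$define = <<<'RE'
~(?(DEFINE)(?<a>\[a\](?&nest)\[/a\])(?<nest>(?:(?&a)|\[[bc]\]|\s)*))(?&a)|(?<tag>\[[bc]\])~
RE;

/**
 * Hypothetical helper: return the tags captured in $group, skipping the
 * matches that consumed a whole [a]...[/a] block (their capture is empty).
 */
function topLevelTags(string $pattern, string $content, $group): array
{
    preg_match_all($pattern, $content, $m);
    $captured = array_filter($m[$group], function ($s) {
        return $s !== '';
    });
    return array_values($captured);
}

echo implode(' ', topLevelTags($hack, $content, 1)), "\n";       // [b] [b] [b] [c]
echo implode(' ', topLevelTags($define, $content, 'tag')), "\n"; // [b] [b] [b] [c]
```

preg_match_all is used rather than preg_match because several tags have to be collected. Each match that swallows a whole [a]...[/a] block leaves the capture empty, which is why the helper filters out empty strings.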