regex pcre

Using OR in negative lookahead


Given an input like @1=A1@2=A2@3=A3>>@1=B1@2=B2@3=B3>>@1=C1@2=C2@3=C3>>@1=B1@2=B2@3=B3, I want to capture the value after @2= when @3=B3, and also verify that any later @2= either contains the same value that was captured OR has the value "ABC".

The inputs that should match are:

@1=A1@2=A2@3=A3>>@1=B1@2=B2@3=B3>>@1=C1@2=B2@3=C3>>@1=B1@2=B2@3=B3
@1=A1@2=A2@3=A3>>@1=B1@2=B2@3=B3>>@1=C1@2=B2@3=C3>>@1=B1@2=ABC@3=B3
@1=A1@2=A2@3=A3>>@1=B1@2=B2@3=B3>>@1=C1@2=B2@3=C3

The inputs that should not match are:

@1=A1@2=A2@3=A3>>@1=B1@2=B2@3=B3>>@1=C1@2=B2@3=C3>>@1=B1@2=B10@3=B3
@1=A1@2=A2@3=A3>>@1=B1@2=B2@3=B3>>@1=B1@2=B10@3=B3>>@1=B1@2=B2@3=B3

I am able to do the part where it should match the entire string using a negative lookaround. But I am not able to do the OR part, i.e. allowing @2=ABC when the value does not match the captured one.

https://regex101.com/r/eCYCtg/1


Solution

  • Note that your current regex also matches when a repeated @2= value merely starts with the previously captured value. To require the whole value to be equal, add @ to the negative lookahead: (?!\1@).

    To get the behavior you need, add ABC@ as an alternative inside this lookahead: (?!\1@|ABC@). The negative lookahead check will now fail (and thus allow the match to occur) if the entire @2 value is ABC or the same value that was captured into Group 1.

    You may use

    ^(?:(?!@2=[^@]*@3=B3(?:[@>]|$)).)*@2=([^@]*)@3=B3(?:[@>]|$)(?!.*@2=(?!\1@|ABC@)[^@]*@3=B3(?:[@>]|$))
    

    See the regex demo.
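
As a quick sanity check, the pattern above can be run against the five sample inputs from the question. The sketch below uses Python's re module rather than PCRE, which should behave the same for this particular pattern since it only relies on lookaheads and a backreference. The helper name first_b3_value, the LOOSE variant without the trailing @, and the extra B20 input are assumptions made up for illustration; they are not part of the original question or answer.

```python
import re

# Pattern from the answer, split across lines only for readability.
STRICT = re.compile(
    r"^(?:(?!@2=[^@]*@3=B3(?:[@>]|$)).)*"          # advance to the first @2=...@3=B3 pair
    r"@2=([^@]*)@3=B3(?:[@>]|$)"                   # capture that @2 value into group 1
    r"(?!.*@2=(?!\1@|ABC@)[^@]*@3=B3(?:[@>]|$))"   # no later @3=B3 pair with another @2 value
)

# Hypothetical variant WITHOUT the trailing @ in the inner lookahead,
# used only to illustrate the prefix problem described in the answer.
LOOSE = re.compile(
    r"^(?:(?!@2=[^@]*@3=B3(?:[@>]|$)).)*"
    r"@2=([^@]*)@3=B3(?:[@>]|$)"
    r"(?!.*@2=(?!\1|ABC)[^@]*@3=B3(?:[@>]|$))"
)

SHOULD_MATCH = [
    "@1=A1@2=A2@3=A3>>@1=B1@2=B2@3=B3>>@1=C1@2=B2@3=C3>>@1=B1@2=B2@3=B3",
    "@1=A1@2=A2@3=A3>>@1=B1@2=B2@3=B3>>@1=C1@2=B2@3=C3>>@1=B1@2=ABC@3=B3",
    "@1=A1@2=A2@3=A3>>@1=B1@2=B2@3=B3>>@1=C1@2=B2@3=C3",
]

SHOULD_NOT_MATCH = [
    "@1=A1@2=A2@3=A3>>@1=B1@2=B2@3=B3>>@1=C1@2=B2@3=C3>>@1=B1@2=B10@3=B3",
    "@1=A1@2=A2@3=A3>>@1=B1@2=B2@3=B3>>@1=B1@2=B10@3=B3>>@1=B1@2=B2@3=B3",
]

# Made-up input: the later @2 value (B20) only starts with the captured one (B2).
PREFIX_CASE = "@1=B1@2=B2@3=B3>>@1=B1@2=B20@3=B3"


def first_b3_value(text, pattern=STRICT):
    """Return the captured @2 value if the whole check passes, else None."""
    m = pattern.search(text)
    return m.group(1) if m else None


if __name__ == "__main__":
    for s in SHOULD_MATCH + SHOULD_NOT_MATCH + [PREFIX_CASE]:
        print(first_b3_value(s), first_b3_value(s, LOOSE), s)
```

The last input shows why the @ matters: with the loose variant, B20 is accepted because it begins with the captured B2, while the stricter lookahead rejects it.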