Tags: regex, regex-lookarounds

Singleline RegEx issue


I'm struggling with a regular expression. I would like to match text similar to the example below.

The text consists of blocks, each starting with wlan. The RegEx should match a block only if the line dot1 authent exists in that block.

The SSIDx after wlan should be captured in the 1st RegEx group. The word after dot1 authent should be captured in the 2nd RegEx group.

Expected matches:

Group 1 Group 2 Result
SSID1 TESTTEST1 OK
SSID3 TESTTEST3 OK
SSID4 TESTTEST4 OK

Example text:

wlan SSID1 SSID1
 test2
 test1
 dot1 authent TESTTEST1
 test3
wlan SSID2 SSID2
 test21
 test11
 test31
wlan SSID3 SSID3
 test22
 test12
 dot1 authent TESTTEST3
 test32
wlan SSID4 SSID4
 test23
 test13
 dot1 authent TESTTEST4
 test33

The following RegEx almost matches the desired content.

(?s:wlan (.+?)\s.+?dot1 authent (.+?)\n)

Unfortunately, if a wlan block doesn't contain the line dot1 authent, the RegEx runs on into the following wlan block, causing a wrong match.

In the example, the matches are the following:

Group 1 Group 2 Result
SSID1 TESTTEST1 OK
SSID2 TESTTEST3 NOT OK
SSID4 TESTTEST4 OK

SSID2 should not match, as dot1 authent is not defined for this block. Instead, SSID3 should match.

I added (?!wlan) to the RegEx, but this didn't have any effect.

(?s:wlan (.+?)\s.+?(?!wlan)dot1 authent (.+?)\n)
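The behaviour described above can be reproduced with a short Python sketch. Python is an assumption here, since the question does not name a regex engine, and the variable names are purely illustrative. It also suggests why the added lookahead changes nothing: (?!wlan) is checked only at the position where dot1 starts, and the text at that position begins with dot1, so the assertion always succeeds while the preceding .+? is still free to run across a wlan line.

```python
import re

# Sample text from the question; the SSID2 block has no "dot1 authent" line.
TEXT = """\
wlan SSID1 SSID1
 test2
 test1
 dot1 authent TESTTEST1
 test3
wlan SSID2 SSID2
 test21
 test11
 test31
wlan SSID3 SSID3
 test22
 test12
 dot1 authent TESTTEST3
 test32
wlan SSID4 SSID4
 test23
 test13
 dot1 authent TESTTEST4
 test33
"""

# The two attempts from the question, unchanged.
ORIGINAL = r"(?s:wlan (.+?)\s.+?dot1 authent (.+?)\n)"
WITH_LOOKAHEAD = r"(?s:wlan (.+?)\s.+?(?!wlan)dot1 authent (.+?)\n)"

# findall returns one (group 1, group 2) tuple per match.
original_matches = re.findall(ORIGINAL, TEXT)
lookahead_matches = re.findall(WITH_LOOKAHEAD, TEXT)

for ssid, auth in original_matches:
    print(ssid, auth)

# Both patterns produce the same list of matches.
print(original_matches == lookahead_matches)
```

To have any effect, the lookahead would need to be checked at every line the pattern skips over, not just once in front of dot1.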

Can anyone give me a hint as to what I did wrong and how to achieve this match?

Many thanks


Solution

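  • You can use a negative lookahead to keep the match from crossing a line that starts a new block (a line beginning with wlan), and require a line containing dot1 authent, with a space on either side, inside the block. Enable the multiline flag so that ^ matches at the start of each line.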

    ^wlan (\S+).*(?:\r?\n(?!wlan|.*? dot1 authent ).*)*\r?\n.*? dot1 authent (.+).*(?:\r?\n(?!wlan).*)*
    



    Alternatively, match only \n line endings, use word boundaries \b around dot1 authent, and stop the match right after the group 2 value instead of consuming the rest of the block:

    ^wlan (\S+).*(?:\n(?!wlan|.*?\bdot1 authent\b).*)*\n.*?\bdot1 authent\b (.+)
    

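As a rough illustration, the first pattern can be applied to the sample text from the question like this. This is a minimal sketch assuming Python's re module (the answer does not name an engine) and \n-only line endings in the input; BLOCK_WITH_AUTH and pairs are hypothetical names chosen for the example.

```python
import re

# Sample text from the question; the SSID2 block has no "dot1 authent" line.
TEXT = """\
wlan SSID1 SSID1
 test2
 test1
 dot1 authent TESTTEST1
 test3
wlan SSID2 SSID2
 test21
 test11
 test31
wlan SSID3 SSID3
 test22
 test12
 dot1 authent TESTTEST3
 test32
wlan SSID4 SSID4
 test23
 test13
 dot1 authent TESTTEST4
 test33
"""

BLOCK_WITH_AUTH = re.compile(
    # Block header, then any lines that neither start a new block
    # nor contain " dot1 authent ".
    r"^wlan (\S+).*(?:\r?\n(?!wlan|.*? dot1 authent ).*)*"
    # The required line, followed by the rest of the same block.
    r"\r?\n.*? dot1 authent (.+).*(?:\r?\n(?!wlan).*)*",
    re.MULTILINE,  # lets ^ match at the start of every line
)

pairs = [(m.group(1), m.group(2)) for m in BLOCK_WITH_AUTH.finditer(TEXT)]

for ssid, auth in pairs:
    print(ssid, auth)
```

Because the lookahead is repeated for every skipped line, a block without dot1 authent simply fails to match and the search resumes at the next wlan line, so SSID2 is skipped rather than paired with TESTTEST3.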