regex powershell regex-lookarounds

Regex using negative lookahead missing first character of group2


I need to get the LDAP group names from this example string:

"user.ldap.groups.name" = "M-Role13" AND ("user.ldap.groups.name"= "M Role1" OR "user.ldap.groups.name" = "M.Group-Role16" OR "user.ldap.groups.name"="Admin Role"  ) AND "common.platform" = "iOS" AND ( AND "ios.PersonalHotspotEnabled" = true  ) AND "common.retired" = False

I'm using this regex to match the parts of the string that contain an LDAP group:

("user\.ldap\.groups\.name"?.=.?".+?(.*?)")(?!"user\.ldap\.groups\.name")

but group 2 captures the name without its first character: https://regex101.com/r/2Aby6K/1


Solution

  • A few notes about the pattern you tried

    • The first character is missing because .+? must match at least one character. It sits in front of the capture group (.*?), so it consumes the first character of the name before group 2 starts.
    • In the part "?.=.?" the first " is optional, and each dot matches any character, not just whitespace. The first dot is required and the second is optional, followed by a literal ".
    • The part (.*?)")(?!"user\.ldap\.groups\.name") uses a non-greedy .*?, which expands as little as possible until it reaches a " that is not directly followed by "user.ldap.groups.name. Because the dot can also match ", the match can run past the closing quote of the value. See an example of an incorrect match.
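
    To see the first note in action, here is a small sketch in Python (used only for illustration, since the question is tagged PowerShell; the variable names `text`, `original` and `group2` are mine). It runs the original pattern, unchanged, against the sample string and collects group 2 of every match:

```python
import re

# Sample string from the question.
text = (
    '"user.ldap.groups.name" = "M-Role13" AND ("user.ldap.groups.name"= "M Role1" '
    'OR "user.ldap.groups.name" = "M.Group-Role16" OR "user.ldap.groups.name"="Admin Role"  ) '
    'AND "common.platform" = "iOS" AND ( AND "ios.PersonalHotspotEnabled" = true  ) '
    'AND "common.retired" = False'
)

# Original pattern from the question, unchanged.
original = re.compile(
    r'("user\.ldap\.groups\.name"?.=.?".+?(.*?)")(?!"user\.ldap\.groups\.name")'
)

# .+? sits in front of group 2 and must consume one character,
# so group 2 starts at the second character of each name.
group2 = [m.group(2) for m in original.finditer(text)]
print(group2)
```

    This should print ['-Role13', ' Role1', '.Group-Role16', 'dmin Role'], i.e. every name with its first character swallowed by .+?.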

    What you might do instead is use a negated character class, which cannot cross the closing quote:

    "user\.ldap\.groups\.name"\s*=\s*"([^"]+)"
    

    In parts

    • "user\.ldap\.groups\.name" Match "user.ldap.groups.name" literally, including the quotes (the dots are escaped)
    • \s*=\s* Match = between 0+ whitespace chars on the left and right
    • "( Match " and start capturing group
      • [^"]+ Match any char except " 1+ times
    • )" Close group and match "
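
    As a quick check of the pattern above, this Python sketch (again only illustrative; `text`, `pattern` and `names` are my own names) applies it to the sample string from the question and collects group 1:

```python
import re

# Sample string from the question.
text = (
    '"user.ldap.groups.name" = "M-Role13" AND ("user.ldap.groups.name"= "M Role1" '
    'OR "user.ldap.groups.name" = "M.Group-Role16" OR "user.ldap.groups.name"="Admin Role"  ) '
    'AND "common.platform" = "iOS" AND ( AND "ios.PersonalHotspotEnabled" = true  ) '
    'AND "common.retired" = False'
)

# [^"]+ cannot match a double quote, so the capture stops at the closing quote.
pattern = re.compile(r'"user\.ldap\.groups\.name"\s*=\s*"([^"]+)"')

# With a single capture group, findall returns the captured strings directly.
names = pattern.findall(text)
print(names)
```

    This should print ['M-Role13', 'M Role1', 'M.Group-Role16', 'Admin Role']. In PowerShell, something along the lines of `[regex]::Matches($text, $pattern) | ForEach-Object { $_.Groups[1].Value }` should give the same names, assuming `$text` and `$pattern` hold the string and the pattern.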

    Regex demo


    Or if you want to include the negative lookahead:

    "user\.ldap\.groups\.name"\s*=\s*"([^"]+)"(?!"user\.ldap\.groups\.name")
    

    Regex demo
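
    To illustrate what the lookahead adds, here is a Python sketch comparing both patterns. The input `contrived` is made up for this example and does not come from the question: its first closing quote is directly followed by another "user.ldap.groups.name". On the sample string from the question that situation never occurs, so there both patterns return the same four names.

```python
import re

plain = re.compile(r'"user\.ldap\.groups\.name"\s*=\s*"([^"]+)"')
with_lookahead = re.compile(
    r'"user\.ldap\.groups\.name"\s*=\s*"([^"]+)"(?!"user\.ldap\.groups\.name")'
)

# Made-up input (not from the question): the closing quote of "A" is
# directly followed by another "user.ldap.groups.name".
contrived = '"user.ldap.groups.name"="A""user.ldap.groups.name"="B"'

print(plain.findall(contrived))           # ['A', 'B']
print(with_lookahead.findall(contrived))  # ['B'], the first match is rejected
```

    Because [^"]+ cannot backtrack across a quote, the lookahead here can only reject a match; it never changes where the captured value ends.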