
Similarities and differences between negative lookahead and a negated character class ("except")


I have the following regex:

(?<t>[^:=]*)(:(?<u>[^:=]*))?(:(?<v>[^:=]*))?=(?<value>.*)

Its purpose is to parse input like AppSettings:Envionment:Other=Dev into its parts, and it works well for that.

I now need to change it to parse AppSettings__Envionment__Other=Dev, where the single colon is replaced with a double underscore.

The problem is that a negated character class ([^...]) only excludes single characters, so it cannot exclude the two-character sequence __.

In searching, I found that when you need to exclude a sequence of several chars you should use a negative lookahead, (?!\_\_). I tried that out (temporarily dropping the requirement for an equals sign):

(?<t>(?!\_\_)*)(\_\_(?<u>(?!\_\_)*))?(\_\_(?<v>(?!\_\_)))

With an input of AppSettings__Envionment__Other, it did not get the captures right (t, u and v should be the words between the double underscores).

Additionally, I still need each part to stop at both the double underscore (__) and the equals sign (=). Running ahead with what I thought should work (but does not), I tried this:

(?<t>(?!\_\_)(?!=)*)(\_\_(?<u>(?!\_\_)(?!=)*))?(\_\_(?<v>(?!\_\_)(?!=)*))?=(?<value>.*)

But that does not even come close to working. (It only matches on the equal sign.)

How can I go from not matching on : and = to not matching on __ and =?

And how would you make a statement that uses ^ for several chars equivalent to several negative lookaheads?

UPDATE:

I should have provided more examples of the inputs that I expect to parse. All of the following need to be parsed by the RegEx:

AppSettings__Envionment__Other=Dev
AppSettings__ConnectionString=SomeValue
LogLevel=Debug

Solution

  • You may use this regex for the __-delimited string:

    (?<t>.*?)(__(?<u>.*?))?(__(?<v>[^=]*))?=(?<value>.*)
    


    RegEx Details:

    • (?<t>.*?): Lazily match 0 or more of any character and capture it in group t
    • (__(?<u>.*?))?: optionally match __ followed by 0 or more of any character (lazy match) and capture it in group u
    • (__(?<v>[^=]*))?: optionally match __ followed by 0 or more of any character other than = and capture it in group v
    • =: Match a =
    • (?<value>.*): Match 0 or more of any character (the rest of the input) and capture it in group value
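
As a quick check of the behaviour described above, here is a minimal sketch in Python. It assumes Python's `re` module, which spells named groups as `(?P<name>...)` rather than `(?<name>...)`; the pattern is otherwise unchanged. The helper name `parse_setting` is made up for illustration and does not come from the question.

```python
import re

# Same pattern as above, with named groups in Python's (?P<name>...) form.
PATTERN = re.compile(
    r"(?P<t>.*?)(__(?P<u>.*?))?(__(?P<v>[^=]*))?=(?P<value>.*)"
)


def parse_setting(line):
    """Hypothetical helper: return the named groups as a dict, or None if no match."""
    m = PATTERN.match(line)
    return m.groupdict() if m else None


if __name__ == "__main__":
    # The three sample inputs from the question's update.
    for line in (
        "AppSettings__Envionment__Other=Dev",
        "AppSettings__ConnectionString=SomeValue",
        "LogLevel=Debug",
    ):
        print(line, "->", parse_setting(line))
```

Groups that do not take part in the match (for example u and v for LogLevel=Debug) come back as None in the dictionary.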

    Another option is to use a tempered greedy token (a bit longer), like this:

    (?<t>(?:(?!__).)*)(__(?<u>(?:(?!__).)*))?(__(?<v>[^=]*))?=(?<value>.*)
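
The attempts in the question capture nothing because a lookahead such as `(?!__)` is zero-width: it only checks what comes next and consumes no characters. Paired with `.`, as in `(?:(?!__).)*`, it behaves like a multi-character version of `[^...]`. The sketch below illustrates this in Python (named groups in `(?P<name>...)` form). The second pattern, which also puts `=` inside the lookahead, is not part of the answer above; it is an assumption about what the question means by "not matching on __ and =". The names `TEMPERED`, `TEMPERED_EQ` and `groups` are made up for illustration.

```python
import re

# A bare lookahead is zero-width: it succeeds here but consumes nothing.
assert re.match(r"(?!__)", "AppSettings").end() == 0

# Pairing it with "." consumes one character at a time and stops before "__".
assert re.match(r"(?:(?!__).)*", "AppSettings__Other").group() == "AppSettings"

# The tempered pattern from the answer, in Python's named-group syntax.
TEMPERED = re.compile(
    r"(?P<t>(?:(?!__).)*)(__(?P<u>(?:(?!__).)*))?(__(?P<v>[^=]*))?=(?P<value>.*)"
)

# Assumed variant: t and u also stop before "=", mirroring [^:=] in the original.
TEMPERED_EQ = re.compile(
    r"(?P<t>(?:(?!__|=).)*)(__(?P<u>(?:(?!__|=).)*))?(__(?P<v>[^=]*))?=(?P<value>.*)"
)


def groups(pattern, line):
    """Hypothetical helper: named groups as a dict, or None if there is no match."""
    m = pattern.match(line)
    return m.groupdict() if m else None


if __name__ == "__main__":
    for line in ("AppSettings__Envionment__Other=Dev", "LogLevel=a=b"):
        print(line, "->", groups(TEMPERED, line), groups(TEMPERED_EQ, line))
```

Both patterns produce the same groups for the three inputs in the question. They can differ when the value itself contains `=`: the greedy tempered groups run past the first `=` and backtrack to the last one, whereas the variant stops at the first.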