python regex regex-lookarounds lookbehind

Regex to match until a character, unless it is preceded by another character


I want to create a regex that matches a string starting with Localize(" and ending at the next ", but not when that " is escaped (preceded by \).

My current regex, which doesn't take the "unless preceded by" condition into account, looks like this:

\bLocalize\(\"(.+?)(?=\")

Any ideas?

EDIT

With the following string:

Localize("/Windows/Actions/DeleteActionWarning=The action you are trying to \"delete\" is referenced in this document.") + " Want to Proceed ?";

I want the match to stop right after document., because the " that follows it is the first one not preceded by a \ (unlike the quotes around delete).
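
For reference, a minimal sketch that reproduces the problem with the regex above. The variable names `naive`, `s`, and `captured` are only illustrative, and the input is written as a raw string so that each \" is a literal backslash followed by a quote:

```python
import re

# The pattern from the question: lazily match up to the first double quote.
naive = re.compile(r'\bLocalize\(\"(.+?)(?=\")')

# Raw string, so \" below is a literal backslash followed by a quote.
s = r'Localize("/Windows/Actions/DeleteActionWarning=The action you are trying to \"delete\" is referenced in this document.") + " Want to Proceed ?";'

m = naive.search(s)
captured = m.group(1) if m else None

# The capture ends at the backslash in front of the first escaped quote,
# so everything from "delete" onwards is lost.
print(captured)
```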


Solution

  • You may use

    \bLocalize\("([^"\\]*(?:\\.[^"\\]*)*)
    


    Details:

    • \bLocalize - a whole word Localize
    • \(" - a (" substring
    • ([^"\\]*(?:\\.[^"\\]*)*) - Capturing group 1:
      • [^"\\]* - 0 or more chars other than " and \
      • (?:\\.[^"\\]*)* - 0 or more repetitions of an escaped char (a \ followed by any char except a line break), followed by 0 or more chars other than " and \
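
The breakdown above can also be written as a commented pattern using re.VERBOSE. This is only a sketch: the names `pattern` and `sample` and the short sample string are made up for illustration, and the comments inside the pattern avoid backslashes on purpose, since a backslash at the end of a verbose-mode comment can swallow the line break.

```python
import re

# Same pattern as above, spread over several lines so each part can be labelled.
pattern = re.compile(r'''
    \bLocalize          # the whole word Localize
    \("                 # an open paren followed by a double quote
    (                   # capturing group 1: the text inside the quotes
        [^"\\]*         #   0+ chars that are neither a quote nor a backslash
        (?:             #   then, repeated 0+ times:
            \\.         #     a backslash plus the char it escapes
            [^"\\]*     #     0+ chars that are neither a quote nor a backslash
        )*
    )
''', re.VERBOSE)

# Hypothetical sample; raw string, so the backslashes are literal.
sample = r'x = Localize("a \"b\" c") + "tail";'

m = pattern.search(sample)
if m:
    print(m.group(1))
```

Because every backslash is consumed together with the character after it, an escaped quote can never be mistaken for the closing quote.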

    In Python, declare the pattern as a raw string literal:

    reg = r'\bLocalize\("([^"\\]*(?:\\.[^"\\]*)*)'
    

    Demo:

    import re

    reg = r'\bLocalize\("([^"\\]*(?:\\.[^"\\]*)*)'
    # Regular (non-raw) literal: \\\" below is a backslash followed by a quote in the actual string
    s = "Localize(\"/Windows/Actions/DeleteActionWarning=The action you are trying to \\\"delete\\\" is referenced in this document.\") + \" Want to Proceed ?\";"
    m = re.search(reg, s)
    if m:
        # Group 1 holds the text between the opening and the closing quote
        print(m.group(1))
    # => /Windows/Actions/DeleteActionWarning=The action you are trying to \"delete\" is referenced in this document.
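
A negative lookbehind is the more obvious way to express "a quote not preceded by a backslash", and it does work for the sample string. The sketch below compares it with the pattern above on an input where the text ends with an escaped backslash. The lookbehind pattern and the input string are hypothetical and only there for comparison.

```python
import re

# Pattern from the answer: consumes each backslash together with the char it escapes.
unrolled = re.compile(r'\bLocalize\("([^"\\]*(?:\\.[^"\\]*)*)')

# Hypothetical lookbehind variant, shown only for comparison.
lookbehind = re.compile(r'\bLocalize\("(.*?)(?<!\\)"')

# Raw string: the quoted text ends with an escaped backslash (two backslashes)
# right before the closing quote.
s = r'Localize("C:\\dir\\") + "x";'

a = unrolled.search(s).group(1)
b = lookbehind.search(s).group(1)

print(a)  # stops at the real closing quote
print(b)  # runs past it, up to the next quote not preceded by a backslash
```

The lookbehind only inspects the single character before the quote, so it cannot tell an escaping backslash from one that is itself escaped. The unrolled pattern reads escapes in pairs from left to right and does not have that problem.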