
Python regex search, match mismatch


I'm trying to check the syntax of an input file that contains the rules for my project.

I want to verify that every rule in it is well formed. This is my regex:

\s*.*\$\s*..*\$\s*\|}\s*.*\s*,*

Which finds this text:

sometimes $so$ |} hello,
life $good$ |} hello, 
not $that$ |} hello

In Python I use re.findall to collect the matching text, join the matches, and compare their total length with the length of the original text. But for some reason it doesn't work.

The code: rule_syntax_check = re.findall("\s*.*\$\s*..*\$\s*\|}\s*.*\s*,*", RULES, re.DOTALL)

For example, this input should be reported as an error:

sometimes $so$ |} hello,
life $good$ |  } hello, 
not $that$ |} hello

But findall matches the second line too, so the number of matched characters equals the length of the input. Is there another way to do this, or what am I missing?


Solution

  • The problem is exactly that you are using re.DOTALL (a.k.a. the re.S flag). DOTALL makes the dot match newlines as well, so a single greedy .* can swallow several lines, including the malformed one. If you take the flag out, .* can no longer run across a line break.


    However, a better solution is to test each record separately. For example, if the records are separated by commas, first split on ",", then use re.match to check each rule against the regular expression. Note that re.match anchors only at the start of the string, not at the end, so add a trailing $ when the whole string must match (though it is not strictly necessary here):

    Something like:

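    import re
    rules_split = RULES.split(',')
    for rule in rules_split:
        # re.match anchors only at the start; the trailing $ pins the end
        if not re.match(r'\s*.*\$\s*.+\$\s*\|}.*$', rule):
            print('Syntax error in rule:', rule)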
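As a rough illustration of both points above, the sketch below first compares re.findall with and without re.DOTALL on the sample input from the question, then checks each comma-separated rule on its own. The names PATTERN, RULE_RE, matched_length and invalid_rules are made up for this example, and re.fullmatch is swapped in for re.match plus a trailing $. It also assumes a rule never contains a comma itself.

```python
import re

# Pattern from the question, unchanged.
PATTERN = r"\s*.*\$\s*..*\$\s*\|}\s*.*\s*,*"

# Sample input from the question; the second rule is malformed ("|  }" instead of "|}").
RULES = (
    "sometimes $so$ |} hello,\n"
    "life $good$ |  } hello, \n"
    "not $that$ |} hello"
)


def matched_length(text, flags=0):
    """Total number of characters covered by all re.findall matches."""
    return sum(len(m) for m in re.findall(PATTERN, text, flags))


# With DOTALL a single match covers the whole input, bad line included.
with_dotall = matched_length(RULES, re.DOTALL)
# Without DOTALL the malformed line is not matched, so the total is shorter.
without_dotall = matched_length(RULES)

# Per-rule check (hypothetical helper): split on commas, strip whitespace,
# and require the entire rule to match.
RULE_RE = re.compile(r"\s*.*\$\s*.+\$\s*\|}.*")


def invalid_rules(text):
    """Return the stripped rules that do not match RULE_RE in full."""
    rules = (piece.strip() for piece in text.split(","))
    return [rule for rule in rules if rule and not RULE_RE.fullmatch(rule)]


print(with_dotall == len(RULES))     # length check passes despite the bad line
print(without_dotall == len(RULES))  # length check now fails
print(invalid_rules(RULES))          # lists the malformed rule
```

Checking rule by rule has the added benefit of telling you which rule is wrong, not just that the lengths differ.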