python regex regex-group python-regex

regex with repeated group names


I'm trying to write a regex that uses duplicated group names. In the example below, I want to find values of ph, A and B such that substituting them into pattern yields string. I use the third-party regex module for this, because Python's built-in re library does not allow duplicate group names.

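import regex  # third-party module; unlike re, it accepts duplicate group names

# Raw string so that backslashes reach the regex engine unchanged
pattern = r'(?P<ph>.*?) __ (?P<A>.*?) __ (?P<B>.*?) __ \( (?P<ph>.*?) \-> (?P<A>.*?) = (?P<B>.*?) \) \)'
string = 'y = N __ ( A ` y ) __ ( A ` N ) __ ( y = N -> ( A ` y ) = ( A ` N ) ) )'
match = regex.fullmatch(pattern, string)
for k, v in match.groupdict().items():
    print(f'{k}: {v}')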

And I retrieve the expected output:

ph: y = N
A: ( A ` y )
B: ( A ` N )

My concern is that there seems to be an issue with this library, or I'm not using it properly. For instance, if I replace string with: string = 'BLABLA __ ( A ` y ) __ ( A ` N ) __ ( y = N -> ( A ` y ) = ( A ` N ) ) )'

then the code above returns exactly the same values for ph, A and B, ignoring the BLABLA prefix at the beginning of string, whereas match should be None because there is no solution.

Any ideas?

Note: more precisely, in my problem I have pairs of patterns/strings (p_0, s_0) ... (p_n, s_n) and I have to find a valid match across these pairs, so I concatenated them with a __ delimiter. I am also curious whether there is a proper way to do this.


Solution

  • Since you want to make sure the first three groups are equal to the corresponding next three groups, you need to use backreferences to the first three groups rather than repeating the identically named capturing groups:

    ^(?P<ph>.*?) __ (?P<A>.*?) __ (?P<B>.*?) __ \( (?P=ph) \-> (?P=A) = (?P=B) \) \)$
    

    Here, (?P=ph), (?P=A) and (?P=B) are named backreferences that match the same text as captured into the groups with corresponding names.

    The ^ and $ anchors are not necessary in your code since you use the regex.fullmatch method, but you need them when you test your pattern online in a regex tester.
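
As a quick check, here is a minimal sketch that runs the backreference pattern against both strings from the question. Because the pattern no longer repeats any group name, the standard re module is enough here. The names PATTERN, solve, good and bad are hypothetical and only serve this illustration.

```python
import re

# Each name is captured once; its second occurrence is a named
# backreference (?P=name) that must match the identical text.
PATTERN = re.compile(
    r'(?P<ph>.*?) __ (?P<A>.*?) __ (?P<B>.*?) __ '
    r'\( (?P=ph) -> (?P=A) = (?P=B) \) \)'
)

def solve(string):
    """Return the ph/A/B values as a dict, or None if no consistent match exists."""
    match = PATTERN.fullmatch(string)
    return match.groupdict() if match else None

good = 'y = N __ ( A ` y ) __ ( A ` N ) __ ( y = N -> ( A ` y ) = ( A ` N ) ) )'
bad = 'BLABLA __ ( A ` y ) __ ( A ` N ) __ ( y = N -> ( A ` y ) = ( A ` N ) ) )'

print(solve(good))  # {'ph': 'y = N', 'A': '( A ` y )', 'B': '( A ` N )'}
print(solve(bad))   # None
```

The second string is rejected because ph can only be captured as BLABLA, which the backreference then fails to find again after the opening parenthesis.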