python, regex, python-re

How to match a string of the form 'word=A B|word=C|word=D E word=E|word=F G'?


I am trying to match strings of the form word=A B|word=C|word=D E word=F|word=G H using Python's re module. The whole string is on a single line. This is the output I want:

word=A B|word=C|word=D E , word=F|word=G H

This is my regex so far:

word=(?:[^word]\w+\s*\w+(\|?:=[^word]*)?)

It is still incomplete, and my attempts to improve it have not produced the desired output.


Solution

  • If you want to match the entire string, use the regex below:

    (?:word=[\w |]+(?=word|$))+
    


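    If you only want to insert a comma between the groups, leaving | as the delimiter inside each group, use the regex below with re.sub(). It matches the single whitespace character that is immediately followed by word=: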

    \s(?=word=)
    

    Example

    import re

    text = "word=A B|word=C|word=D E word=F|word=G H"
    # Replace the whitespace character that precedes "word=" with a comma
    print(re.sub(r"\s(?=word=)", ",", text))
    

    Output

    word=A B|word=C|word=D E,word=F|word=G H
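The output above has no spaces around the comma, whereas the question asks for " , ". The following is a minimal sketch, not part of the original answer, that checks the first regex against the whole string with re.fullmatch() and then rebuilds the exact spacing from the question by splitting on the same lookahead pattern. The variable names pattern and groups are illustrative assumptions.

```python
import re

text = "word=A B|word=C|word=D E word=F|word=G H"

# First regex from the answer: one or more "word=..." chunks, each ending
# right before the next "word" or at the end of the string.
pattern = re.compile(r"(?:word=[\w |]+(?=word|$))+")

# fullmatch() succeeds only if the pattern covers the entire string.
print(pattern.fullmatch(text) is not None)

# Split on the whitespace character that precedes "word=" (same lookahead
# as in the re.sub() example); the "|"-separated items stay together.
groups = re.split(r"\s(?=word=)", text)
print(groups)

# Join with " , " to reproduce the spacing shown in the question.
print(" , ".join(groups))
```

Using re.split() instead of re.sub() also gives you the groups as a list, which may be handy if each group needs further processing. Note that the lookahead (?=word|$) in the first regex assumes the values themselves never contain the text "word".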