Tags: python, regex, python-re

Regex: expression to match 'name1: A=a name2:B=b name3:C=c d'


I am using Python's re module (re.compile()) to split name1:A=a name2:B=b name3:C=c d into:

name1 A=a, name2 B=b, name3 C=c d

This is my regex for now:

(\w+): (A|B|C)(=[\w+\s*\w*]+)

But it gives me this output instead:

name1: A=a name2: B=b name3: C=c d

The bold text is what the regex captures. A, B and C come from a predefined list of headings, i.e. only these will occur before an '=' sign.


Solution

  • Instead of splitting, you could match the relevant parts:

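    import re

    text = "name1:A=a name2:B=b name3:C=c d"

    # A name and a colon, followed by one or more whitespace-separated
    # tokens of the form "word" or "word=word".
    rx = re.compile(r'\w+:(?:\w+(?:=\w+)?(?:\s+|$))+')

    for match in rx.finditer(text):
        # Each match looks like "name1:A=a "; split it at the colon.
        name, rest = match.group(0).split(":")
        print("{}, {}".format(name, rest))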
    

    This yields

    name1, A=a 
    name2, B=b 
    name3, C=c d
    

    See a demo for the expression on regex101.com.
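
Since the question says that only A, B and C can appear before an '=' sign, a variation could build those headings into the pattern and capture the name and the rest as separate groups. This is only a sketch under that assumption: the `headings` list and the `pairs` variable are illustrative names, not part of the original answer. The lookahead stops each value just before the next `name:` token or at the end of the string, so no trailing whitespace is captured, and the optional `\s*` after the colon also tolerates the `name1: A=a` spelling.

```python
import re

text = "name1:A=a name2:B=b name3:C=c d"

# Assumption: the predefined headings are exactly these three.
headings = ["A", "B", "C"]
heading_alt = "|".join(map(re.escape, headings))

# Group 1: the name before the colon.
# Group 2: heading=value, where the value is matched lazily and ends
# right before the next "name:" token or at the end of the string.
rx = re.compile(r'(\w+):\s*((?:{})=.*?)(?=\s+\w+:|$)'.format(heading_alt))

pairs = [(m.group(1), m.group(2)) for m in rx.finditer(text)]
print(", ".join("{} {}".format(name, rest) for name, rest in pairs))
```

For the sample string this prints `name1 A=a, name2 B=b, name3 C=c d`, which is the format asked for in the question.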