Tags: python, regex, string, parsing, string-parsing

How to parse values that appear after the same string in Python?


I have an input text like this (the actual text file also contains lots of garbage characters surrounding these two strings):

(random_garbage_char_here)**value=xxx**;(random_garbage_char_here)**value=yyy**;(random_garbage_char_here)

I am trying to parse the text and store the results as value1="xxx" and value2="yyy". I wrote the following Python code:

value1_start = content.find('value')
value1_end = content.find(';', value1_start)

value2_start = content.find('value')
value2_end = content.find(';', value2_start)


print "%s" %(content[value1_start:value1_end])
print "%s" %(content[value2_start:value2_end])

But it always returns:

value=xxx
value=xxx

Could anyone tell me how I can parse the text so that the output is:

value=xxx
value=yyy

Solution

  • Use a regex approach:

    re.findall(r'\bvalue=[^;]*', s)
    

    Or, if the key before = can be any sequence of one or more word characters (letters, digits, underscores):

    re.findall(r'\b\w+=[^;]*', s)
    

    Details:

    • \b - word boundary
    • value= - a literal char sequence value=
    • [^;]* - zero or more chars other than ;.
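
The question asks to store just xxx and yyy. One possible way, sketched here as an illustration rather than as part of the original answer, is to wrap the [^;]* part in a capture group, since re.findall returns only the captured text when the pattern contains groups. The variable names values and pairs are made up for this example.

```python
import re

s = "$%$%&^(&value=xxx;$%^$%^$&^%^*value=yyy;%$#^%"

# One capture group: findall returns only the text inside the group.
values = re.findall(r"\bvalue=([^;]*)", s)

# Two capture groups: findall returns (key, value) tuples,
# which is handy with the generic \w+ variant of the pattern.
pairs = re.findall(r"\b(\w+)=([^;]*)", s)

print(values)  # ['xxx', 'yyy']
print(pairs)   # [('value', 'xxx'), ('value', 'yyy')]
```

If exactly two matches are expected, they can be unpacked with value1, value2 = values; this raises ValueError when the count differs.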

    See the Python demo:

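    import re

    # Compile once: "value=" followed by everything up to the next ";"
    rx = re.compile(r"\bvalue=[^;]*")
    s = "$%$%&^(&value=xxx;$%^$%^$&^%^*value=yyy;%$#^%"
    res = rx.findall(s)  # all non-overlapping matches, left to right
    print(res)  # ['value=xxx', 'value=yyy']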
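
For comparison, the code in the question prints value=xxx twice because the second content.find('value') call starts searching from index 0 again and finds the first occurrence. A minimal sketch of a find-based alternative follows, assuming each value ends at the next ";" (or at the end of the text). The helper name find_values and its parameters are hypothetical, not from the original answer.

```python
def find_values(content, key="value", terminator=";"):
    """Collect every chunk from `key` up to the next `terminator` using str.find."""
    results = []
    pos = 0
    while True:
        # Resume the search where the previous match ended, not at index 0.
        start = content.find(key, pos)
        if start == -1:
            break
        end = content.find(terminator, start)
        if end == -1:
            # No terminator left: take the rest of the text.
            end = len(content)
        results.append(content[start:end])
        pos = end
    return results


content = "$%$%&^(&value=xxx;$%^$%^$&^%^*value=yyy;%$#^%"
print(find_values(content))  # ['value=xxx', 'value=yyy']
```

Unlike the regex with \b, this sketch does not check for a word boundary, so a key such as myvalue= in the garbage would also match.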