regex python-3.x recursion data-extraction

Python: How to use regex to find a repetitive string


I have some log data, and I want to output a whole block of it whenever a keyword is found in that block. How can I retrieve everything from the block's first '#' to its last ')' using a regular expression?

//Log_1.txt
# DON'T WANT #
{12345.54321}
[Tues Jul 2 01:23:45 2019]
< SOME_TYPE 
(some_ID = [12345] reportChange::someMoreInfo called with invalid some ID)

# DON'T WANT #
{12345.54321}
[Tues Jul 2 01:23:45 2019]
< SOME_TYPE 
(some_ID = [12345] failed::someMoreInfo called with invalid some ID)

CODE

import re

with open("Log_1.txt", 'r') as f:
    result = re.search('#(.*)#', f.read())

print(result.group(0))

This isn't all of my code, but if the keyword is "reportChange", the output should be:

# DON'T WANT #
  .
  .
  .
(some_ID = [12345] reportChange::someMoreInfo called with invalid some ID)

instead of

# DON'T WANT #

Solution

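  • Assuming you want the match to start at the nearest # DON'T WANT # header before the keyword, you can use the regex #(.*)#[^)]+yourKeyWordHere[^)]+\). Your original pattern stops at the header line because . does not match a newline by default. The negated class [^)] does match newlines, so [^)]+ carries the match across the following lines up to the keyword and then on to the closing ). In Python you can use string formatting, with {} in the pattern as a placeholder for whatever keyword you want.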

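    import re

    keyword = 'reportChange'

    with open("Log_1.txt", 'r') as f:
        # Raw string keeps \) intact; re.escape makes the keyword match literally.
        pattern = r'#(.*)#[^)]+{}[^)]+\)'.format(re.escape(keyword))
        result = re.search(pattern, f.read())

    # re.search returns None when no block contains the keyword.
    if result:
        print(result.group(0))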
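
    If the log can contain several blocks with the keyword, the same pattern can be applied with re.finditer instead of re.search. The following is a minimal sketch, assuming every block looks like the sample in the question (a # ... # header line, with no ')' before the final message line). The names find_blocks and sample are hypothetical and introduced only for illustration, and the text is held in a string instead of being read from Log_1.txt.

```python
import re


def find_blocks(text, keyword):
    """Return every block that runs from a '# ... #' header line to the
    ')' that closes the line containing `keyword`."""
    # re.escape keeps characters such as '.' or '[' in the keyword literal.
    pattern = re.compile(r'#.*#[^)]+' + re.escape(keyword) + r'[^)]+\)')
    return [match.group(0) for match in pattern.finditer(text)]


# Stand-in for the contents of Log_1.txt (copied from the question).
sample = """# DON'T WANT #
{12345.54321}
[Tues Jul 2 01:23:45 2019]
< SOME_TYPE 
(some_ID = [12345] reportChange::someMoreInfo called with invalid some ID)

# DON'T WANT #
{12345.54321}
[Tues Jul 2 01:23:45 2019]
< SOME_TYPE 
(some_ID = [12345] failed::someMoreInfo called with invalid some ID)
"""

for block in find_blocks(sample, "failed"):
    print(block)
```

    Because [^)]+ cannot cross a ')', a match can never start in one block and end in another, so a keyword that only appears in the second block returns the second block alone.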