How to obtain a substring of a large text string in Python?


I have the following format of text files, which are outputs of an API:

TASK [Do this]
OK: {
    "changed":false,
    "msg": "check ok"
}

TASK [Do that]
OK

TASK [Do x]
Fatal: "Error message x"

TASK [Do y]
OK

TASK [Do z]
Fatal: "Stopped because of previous error"

The number of lines and tasks before and after the "Fatal" error varies, and I am only interested in the "Error message x" part.

Code as of now:

import requests

url = "<API URL>"  # placeholder for the API URL
r = requests.get(url, verify=False, allow_redirects=True, headers=headers, timeout=10)
output = r.text

I tried output.split("Fatal", 1)[1], but it raises "list index out of range" and also seems to mess up the text, adding a lot of \n.


Solution

  • You can use the re module to search for the text you need with a regular expression. There are probably better patterns, but I wrote this one quickly using regex101.com: Fatal: "(.+)". Note that re.findall returns every match, so errors[0] is the first error message.

    import re
    
    s = '''TASK [Do this]
    OK: {
        "changed":false,
        "msg": "check ok"
    }
    
    TASK [Do that]
    OK
    
    TASK [Do x]
    Fatal: "Error message x"
    
    TASK [Do y]
    OK
    
    TASK [Do z]
    Fatal: "Stopped because of previous error"'''
    
    errors = re.findall(r'Fatal: "(.+)"', s)
    
    for x in errors:
        print(x)
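
Since the question only asks for the first "Fatal" message, a variant using re.search may be closer to what is needed. This is a minimal sketch, not a definitive implementation: the helper name first_fatal_message and the constant FATAL_RE are made up for illustration, and the pattern assumes each error sits on its own line in the form Fatal: "message".

```python
import re
from typing import Optional

# Assumed line format: Fatal: "some message" (one error per line).
FATAL_RE = re.compile(r'^\s*Fatal: "(.*?)"\s*$', re.MULTILINE)


def first_fatal_message(text: str) -> Optional[str]:
    """Return the message of the first 'Fatal: "..."' line, or None if there is none."""
    match = FATAL_RE.search(text)
    return match.group(1) if match else None


sample = '''TASK [Do x]
Fatal: "Error message x"

TASK [Do z]
Fatal: "Stopped because of previous error"'''

print(first_fatal_message(sample))                # Error message x
print(first_fatal_message("TASK [Do that]\nOK"))  # None
```

Returning None when nothing matches also avoids the "list index out of range" error from the question: output.split("Fatal", 1) returns a one-element list when "Fatal" does not occur in the text, so indexing it with [1] fails.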