Output compiler error to a txt file in python


I'm a beginner using Python. I want to create a regular expression to capture error messages from compiler output in Python. How would I do this?

For example, if the compiler output is the following error message:

Traceback (most recent call last):
  File "sample.py", line 1, in <module>
    hello
NameError: name 'hello' is not defined

I want to be able to extract only the following string from the output:

NameError: name 'hello' is not defined

In this case there is only one error, but I want to extract all the errors the compiler outputs. How do I do this using regular expressions? If there is an easier way, I'm open to suggestions.


Solution

  • r'Traceback \(most recent call last\):\n(?:[ ]+.*\n)*(\w+: .*)'
    

    should extract your exception; after the header line, every line of a traceback starts with whitespace except for the final exception line.

    The pattern matches the literal text of the traceback's first line, then zero or more lines that start with at least one space, and then captures the next line, provided it starts with one or more word characters (which fits Python identifiers nicely), followed by a colon, a space, and the rest of the line.

    Demo:

    >>> import re
    >>> sample = '''\
    ... Traceback (most recent call last):
    ...   File "sample.py", line 1, in <module>
    ...     hello
    ... NameError: name 'hello' is not defined
    ... '''
    >>> re.search(r'Traceback \(most recent call last\):\n(?:[ ]+.*\n)*(\w+: .*)', sample).groups()
    ("NameError: name 'hello' is not defined",)
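
The question also asks for every error in the output, not just the first one. One way to get that is to apply the same pattern with `re.findall`, which returns the captured group for each match. The following is a minimal sketch, not a definitive implementation: the helper names `extract_errors` and `save_errors`, the sample output, and the `errors.txt` file name mentioned below are made up for illustration.

```python
import re

# Same pattern as in the answer above, compiled once for reuse.
TRACEBACK_RE = re.compile(
    r'Traceback \(most recent call last\):\n(?:[ ]+.*\n)*(\w+: .*)'
)


def extract_errors(output):
    """Return the final exception line of every traceback found in output."""
    # With exactly one capturing group, findall returns a list of the
    # captured strings, one per match.
    return TRACEBACK_RE.findall(output)


def save_errors(errors, path):
    """Write one error message per line to a text file (hypothetical helper)."""
    with open(path, 'w', encoding='utf-8') as fh:
        for error in errors:
            fh.write(error + '\n')


# Hypothetical output containing two tracebacks and an unrelated line.
output = '''\
Traceback (most recent call last):
  File "sample.py", line 1, in <module>
    hello
NameError: name 'hello' is not defined
some unrelated output
Traceback (most recent call last):
  File "other.py", line 3, in <module>
    1 / 0
ZeroDivisionError: division by zero
'''

errors = extract_errors(output)
print(errors)
```

Calling `save_errors(errors, 'errors.txt')` afterwards would write the two messages to a text file, one per line. Note that because the pattern requires `\w+: `, it will likely miss exception lines that have no message (such as a bare `KeyboardInterrupt`) or that are printed with a dotted module prefix (such as `json.decoder.JSONDecodeError: ...`); the final group would need to be loosened to cover those cases.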