python python-3.x traceback

Different syntax errors hide line in output


I have a script which calls compile.

import sys
import traceback

try:
    code = compile('3 = 3', 'test', 'exec')
except Exception as e:
    sys.stderr.write(''.join(traceback.format_exception_only(type(e), e)))

3 = 3 results in:

File "test", line 1
SyntaxError: can't assign to literal

Whereas 3 = 3a actually prints the line:

File "test", line 1
    3 = 3a
        ^
SyntaxError: invalid syntax

Any idea why that is?


Solution

  • Python produces SyntaxError exceptions in two places:

    1. when parsing, which is driven by the Python grammar
    2. when creating the Abstract Syntax Tree (AST) from the parse result; the AST drives the compiler.

    This split exists because it was easier to keep the grammar simple and run additional syntax checks once parsing is complete, while building the AST.

    Assignments are one of those places, because the rules for what is allowed on the left of the = sign differ from, but are closely related to, the rules for what is allowed on the right. The left-hand side is the target side; targets can be structured like lists or tuples (unpacking assignments), and you can assign to attributes or indexing operations (listobj[1] = ..., etc.). But having the parser detect that a target is actually a literal, and not a variable name, attribute, and so on, would require a very different parser structure, so this check is left to the AST stage instead.

    So your 3 = 3 error passes the parsing stage but then fails the later AST 'assignment target' check, while 3 = 3a fails at the parser stage (where 3a is easy to spot as an error).
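
Both kinds of failure surface as the same exception type, which is why the difference is easy to miss. A minimal sketch (the helper name syntax_error_for is made up for illustration; the exact messages, and which stage reports each error, can vary between Python versions):

```python
def syntax_error_for(source, filename='test'):
    """Return the SyntaxError raised when compiling source, or None."""
    try:
        compile(source, filename, 'exec')
    except SyntaxError as e:
        return e
    return None


# Both invalid statements raise SyntaxError; only the message and the
# attached location details tell them apart. The last one is valid.
for source in ('3 = 3', '3 = 3a', 'x = 3'):
    err = syntax_error_for(source)
    if err is None:
        print(repr(source), 'compiles')
    else:
        print(repr(source), 'raises', type(err).__name__ + ':', err.msg)
```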

    To produce a helpful error message, SyntaxError exceptions raised by the parser include the source code line:

    >>> try:
    ...     code = compile('3 = 3a', 'test', 'exec')
    ... except Exception as e:
    ...     print(repr(e))
    ...
    SyntaxError('invalid syntax', ('test', 1, 6, '3 = 3a\n'))
    

    Note the ('test', 1, 6, '3 = 3a\n') tuple in the exception; these are available via the SyntaxError attributes filename, lineno (the line number), offset (the column offset) and text for the source code line itself. For the parser, this is easily provided, as it has access to the source code.
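
To look at those attributes individually rather than through repr(), something along these lines should work. The helper name syntax_error_details is hypothetical, and the offset and text values you see depend on the Python version:

```python
def syntax_error_details(source, filename='test'):
    """Compile source and return the location attributes of any SyntaxError."""
    try:
        compile(source, filename, 'exec')
    except SyntaxError as e:
        return {
            'msg': e.msg,
            'filename': e.filename,
            'lineno': e.lineno,   # 1-based line number
            'offset': e.offset,   # column offset within the line
            'text': e.text,       # source line; may be '' or None
        }
    return None


details = syntax_error_details('3 = 3a')
for name, value in details.items():
    print(name, '=', repr(value))
```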

    But the AST stage doesn't have the source text. It only has the filename, line number, column, and parse tree objects. It would normally try to read the line from the file named by the filename, but test is not actually a file. So the line is empty:

    >>> try:
    ...     code = compile('3 = 3', 'test', 'exec')
    ... except Exception as e:
    ...     print(repr(e))
    ...
    SyntaxError('cannot assign to literal', ('test', 1, 1, ''))
    

    You can test for this and fix it by replacing the SyntaxError exception with a new one in which the empty string is replaced by your source text:

    >>> import sys, traceback
    >>> source = '3 = 3'
    >>> try:
    ...     code = compile(source, 'test', 'exec')
    ... except Exception as e:
    ...     if isinstance(e, SyntaxError) and not e.text:
    ...         sline = source.splitlines(True)[e.lineno - 1]
    ...         e = SyntaxError(e.msg, (e.filename, e.lineno, e.offset, sline))
    ...     sys.stderr.write(''.join(traceback.format_exception_only(type(e), e)))
    ...
      File "test", line 1
        3 = 3
        ^
    SyntaxError: cannot assign to literal
    

    Note that for a multi-line source string you need to split the source into lines and use the .lineno attribute to select the indicated line, which is what the source.splitlines(True)[e.lineno - 1] expression above does.
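
The same fix can be wrapped in a small reusable function. This is only a sketch: format_compile_error is a made-up name, and on Python versions that already fill in e.text the replacement branch is simply skipped:

```python
import traceback


def format_compile_error(source, filename='test'):
    """Compile source; return the formatted SyntaxError, or None on success."""
    try:
        compile(source, filename, 'exec')
    except SyntaxError as e:
        if not e.text and e.lineno:
            lines = source.splitlines(True)
            if e.lineno <= len(lines):
                # Rebuild the exception with the missing source line filled in.
                e = SyntaxError(
                    e.msg,
                    (e.filename, e.lineno, e.offset, lines[e.lineno - 1]))
        return ''.join(traceback.format_exception_only(type(e), e))
    return None


# The error is on the second line of a multi-line source string.
print(format_compile_error('x = 1\n3 = 3\n'))
```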

    The alternative is to write the source code to a temporary file and pass its filename to compile(), so that when a SyntaxError is raised while building the AST, Python can open that file and find the corresponding line of text.
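
A rough sketch of the temporary-file approach, assuming the standard tempfile module. The function name format_error_via_tempfile is invented for this example, and the file is created with delete=False so it can be reopened by name on every platform:

```python
import os
import tempfile
import traceback


def format_error_via_tempfile(source):
    """Compile source from a temporary file so Python can look up the line."""
    tmp = tempfile.NamedTemporaryFile('w', suffix='.py', delete=False)
    try:
        with tmp:
            tmp.write(source)
        # The file is closed (and flushed) here but still exists on disk.
        try:
            compile(source, tmp.name, 'exec')
        except SyntaxError as e:
            return ''.join(traceback.format_exception_only(type(e), e))
        return None
    finally:
        # Clean up the temporary file whether or not compilation failed.
        os.unlink(tmp.name)


print(format_error_via_tempfile('3 = 3\n'))
```

The formatted error then names the temporary file rather than 'test', so this suits cases where the filename shown to the user does not matter.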

    Note that when you use the special filename '<string>', no attempt is made to find the source code for the line, and e.text is set to None:

    >>> try:
    ...     code = compile('3 = 3', '<string>', 'exec')
    ... except Exception as e:
    ...     print(repr(e))
    ...
    SyntaxError('cannot assign to literal', ('<string>', 1, 1, None))
    

    and when the .text attribute is set to None, the traceback module forgoes printing the line-and-marker section.
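
You can see the effect of the text attribute on the traceback module directly by building the exception by hand. The values below are illustrative, not taken from a real compile() call, and format_with_text is a hypothetical helper:

```python
import traceback


def format_with_text(text):
    """Format a hand-built SyntaxError whose text attribute is text."""
    err = SyntaxError('cannot assign to literal', ('<string>', 1, 1, text))
    return ''.join(traceback.format_exception_only(type(err), err))


without_line = format_with_text(None)     # no source line is printed
with_line = format_with_text('3 = 3\n')   # source line is printed
print(without_line)
print(with_line)
```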

    If you want to know exactly why the Python grammar parser won't detect literals in the assignment target, you might be interested in the work Guido van Rossum is doing on a different parser for Python, which includes exposition on why the current parser works the way it does and how an alternative parser model can avoid these issues.