python, error-messaging

Why does Python print only one error message?


A lot of the languages I know of (like C++, C, Rust, etc.) print multiple error messages at a time. Why, then, does Python print only one error message?


Solution

  • First off, I assume we are talking about syntax errors, i.e. those that can (and should) be detected and reported by the compiler.

    It is primarily a design choice. Python is built on the notion that as much as possible should be done at runtime, and the compiler is deliberately kept as simple as possible.

    Simple and easy to understand, or complex and sophisticated: simply put, you have a choice between a very simple compiler that is easy to understand and maintain, and a complex piece of machinery with sophisticated program analysis and optimisations.

    Languages like C, C++, and Rust draw their strength from heavily optimising code during compilation, and thus took the second route with highly complex and extremely sophisticated compilers. Dealing with syntax errors is one of their less impressive feats.

    Python, on the other hand, went the other way. In general, it is all but impossible for a Python compiler to predict what exactly a piece of Python code is doing without actually running it. That precludes most of the interesting opportunities for optimisation in the first place, so a sophisticated compiler would not really make sense anyway. Keeping Python's compiler simple and focusing on runtime optimisations instead is thus the right choice. But it comes with the downside that the compiler simply bails out as soon as it discovers an error.
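
    As a quick check of this behaviour, here is a minimal sketch that hands the built-in compile() a source string containing two independent syntax errors. The sample source and the helper name first_syntax_error are made up for illustration. Only a single exception comes back, and the exact message text may differ between Python versions.

```python
# Hypothetical sample with two independent syntax errors (lines 1 and 3).
SOURCE = 'pen color("red")\ny = 1\nz = = 2\n'


def first_syntax_error(source):
    """Return the SyntaxError raised while compiling source, or None."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as exc:
        return exc
    return None


error = first_syntax_error(SOURCE)
# Only one exception is raised; the problem on line 3 is never reported.
print(f"{type(error).__name__} on line {error.lineno}: {error.msg}")
```

    Fixing line 1 and compiling again would then surface the error on line 3, which is the edit-and-retry loop Python users are familiar with.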


    To give a bit more context...

    1. Error Recovery

    Dealing with syntax errors and recovering from them in a compiler is hard.

    Compilers are usually very good at translating (syntactically) correct programs fast and efficiently to machine code that represents the original program. However, if there is a syntax error, it is often impossible for the compiler to guess the original intention of the programmer, and it is therefore not clear what to do with an erroneous piece of code.

    Here is a very simple example:

    pen color("red")
    

    Obviously, there is something wrong here, but without further context it is impossible to tell whether the original intent of this line was pen = color("red"), pencolor("red"), pen.color("red") or something else entirely.
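
    To make the ambiguity concrete, the following small sketch uses ast.parse to confirm that the broken line is rejected while each of the three candidate repairs is perfectly valid Python. The helper name parses is an assumption for this example; nothing in the parser's result tells us which candidate the programmer meant.

```python
import ast


def parses(source):
    """Return True if source is syntactically valid Python."""
    try:
        ast.parse(source)
    except SyntaxError:
        return False
    return True


broken = 'pen color("red")'
candidates = [
    'pen = color("red")',   # assignment
    'pencolor("red")',      # call to a single function
    'pen.color("red")',     # method call
]

print(broken, "->", parses(broken))
for candidate in candidates:
    print(candidate, "->", parses(candidate))
```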

    If a compiler wants to continue looking at the rest of the program (and thus potentially discover more syntax errors), it needs a strategy for coping with such situations so that it can move on: it needs an error recovery strategy. This might be something as simple as skipping the entire line or individual tokens, but there is no clear-cut "correct" solution to this.
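
    The "skip the entire line" strategy can be sketched in a few lines on top of the built-in compile(). This is only an illustration under simple assumptions (the function name collect_syntax_errors, the sample source, and the max_errors limit are all made up here), not how any real Python implementation recovers from errors.

```python
def collect_syntax_errors(source, max_errors=10):
    """Naive recovery: blank out each offending line and compile again."""
    lines = source.splitlines()
    errors = []
    while len(errors) < max_errors:
        try:
            compile("\n".join(lines) + "\n", "<example>", "exec")
        except SyntaxError as exc:
            errors.append((exc.lineno, exc.msg))
            line_index = (exc.lineno or 0) - 1
            # Stop if the reported line is unusable or already blank,
            # otherwise we would loop on the same error forever.
            if not 0 <= line_index < len(lines) or not lines[line_index].strip():
                break
            lines[line_index] = ""  # a blank line keeps later line numbers stable
        else:
            break  # the remaining source compiles cleanly
    return errors


# Hypothetical sample with independent errors on lines 1 and 3.
SOURCE = 'pen color("red")\ny = 1\nz = = 2\n'
errors = collect_syntax_errors(SOURCE)
for lineno, message in errors:
    print(f"line {lineno}: {message}")
```

    This works here only because both errors are confined to a single line each. As soon as an error spans several lines, or a skipped line changes the meaning of what follows, blanking a line is just one guess among many.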

    2. Python's Compiler

    Python compiles your program one symbol at a time.

    At the time of writing, Python's compiler works by looking at one symbol at a time (its parser is a so-called LL(1) parser). This makes it extremely simple to generate the parser for Python automatically, and it is quite fast and efficient. But it means that there are situations where, despite an "obvious" syntax error, Python happily moves on compiling the program until it is really lost. (CPython 3.9 and later use a PEG parser instead, but it likewise reports only the first syntax error.)

    Take a look at this example:

    x = foo(
    y = bar()
    if x > y:
    

    As humans, we quickly see the missing closing parenthesis in line 1. However, from the compiler's perspective, this looks rather like a call with a named argument, something like this:

    x = foo(y = bar() if x > y else 0)
    

    Accordingly, Python will only notice that something is wrong when it hits the colon in line 3, the first symbol that does not fit its "assumption". But at that point, it is extremely hard to figure out what to do with this piece of code, and how to recover correctly: do you just skip the colon in this case? Or should you go back and correct something earlier on, and if so, how far back do you go?
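
    The claim that the first two lines look like the start of a call with a named argument can be checked with the ast module. In this sketch, the variable names and the completed three-line variant (with "else 0" in place of the colon) are assumptions for illustration. Because line breaks inside parentheses are ignored, the multi-line version parses to exactly the same tree as the one-line call.

```python
import ast

one_line = "x = foo(y = bar() if x > y else 0)"
# Same tokens as the broken example, but completed with "else 0)".
multi_line = "x = foo(\ny = bar()\nif x > y else 0)"
# The broken example itself (with a body added for the "if").
unfinished = "x = foo(\ny = bar()\nif x > y:\n    pass\n"

tree_one = ast.parse(one_line)
tree_multi = ast.parse(multi_line)

# ast.dump omits line numbers by default, so layout differences vanish.
same_tree = ast.dump(tree_one) == ast.dump(tree_multi)

call = tree_multi.body[0].value   # right-hand side of the assignment
keyword = call.keywords[0]        # the named argument "y"
print(same_tree, type(call).__name__, keyword.arg, type(keyword.value).__name__)

try:
    ast.parse(unfinished)
    unfinished_ok = True
except SyntaxError:
    unfinished_ok = False
print("unfinished parses:", unfinished_ok)
```

    Which line and message the interpreter reports for the unfinished variant depends on the Python version, so the sketch deliberately checks only that a SyntaxError is raised.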

    3. Follow-up Errors

    Error recovery can create "ghost" errors.

    In the first example above, the compiler could just skip the entire line and move on without any issue. But there are situations where the choice of how to recover from a syntax error influences (potentially) everything that follows, as in this example:

    deffoo(x):
    

    The intent behind this could either be def foo(x): or simply a call deffoo(x). But this distinction determines how the compiler will look at the code that follows, and either report an indentation error, or perhaps a return outside a function, etc.

    The danger of error recovery is that the compiler's guess might actually be wrong, which could lead to a whole series of reported follow-up errors—which might not even be true errors, but rather ghosts created by the compiler's wrong decision.
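
    A ghost error of this kind is easy to reproduce by hand. The sketch below plays the role of a recovering compiler that has to guess what deffoo(x): was meant to be, and then compiles the function body under each guess. The helper syntax_error_message and the one-line body are assumptions for this example.

```python
def syntax_error_message(source):
    """Return 'line N: message' for the first syntax error, or None."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as exc:
        return f"line {exc.lineno}: {exc.msg}"
    return None


BODY = "    return x + 1\n"  # hypothetical code following the broken line

# Guess 1: the programmer meant a call, so the stray colon is dropped.
as_call = "deffoo(x)\n" + BODY
# Guess 2: the programmer meant a function definition.
as_def = "def foo(x):\n" + BODY

ghost = syntax_error_message(as_call)
clean = syntax_error_message(as_def)
print("guess 1 (call):      ", ghost)
print("guess 2 (definition):", clean)
```

    Under the first guess, the perfectly fine body is reported as an unexpected indent on line 2, an error the programmer never made. Under the second guess, nothing further is reported.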


    Bottom line: getting error recovery and error reporting right is extremely hard. Python's choice to report only the first syntax error it encounters is thus sensible, and it works just fine for most users and situations.

    I have actually written a parser with more sophisticated error detection, which can list all errors it discovers in a Python program. But in my experience, too many of the additional errors beyond the first one are just rubbish, and I have therefore always stuck to displaying only the first error in a program.