python, python-3.x, syntax-error, python-internals

Printing without parentheses varying error message using Python 3


When I try to use print without parentheses on a simple name in Python 3.4 I get:

>>> print max
Traceback (most recent call last):
  ...
  File "<interactive input>", line 1
    print max
            ^
SyntaxError: Missing parentheses in call to 'print'

Ok, now I get it, I just forgot to port my Python 2 code.

But now when I try to print the result of a function:

>>> print max([1,2])
Traceback (most recent call last):
    ...
    print max([1,2])
            ^
SyntaxError: invalid syntax

Or:

print max.__call__(23)
        ^
SyntaxError: invalid syntax

(Note that the caret points to the character before the first dot in that case.)

The message is different (and slightly misleading, since the marker is below the max function).

Why isn't Python able to detect the problem earlier?

Note: This question was inspired by the confusion around this question: Pandas read.csv syntax error, where a few Python experts missed the real issue because of the misleading error message.


Solution

  • Looking at the source code for exceptions.c, right above _set_legacy_print_statement_msg there's this nice block comment:

    /* To help with migration from Python 2, SyntaxError.__init__ applies some
     * heuristics to try to report a more meaningful exception when print and
     * exec are used like statements.
     *
     * The heuristics are currently expected to detect the following cases:
     *   - top level statement
     *   - statement in a nested suite
     *   - trailing section of a one line complex statement
     *
     * They're currently known not to trigger:
     *   - after a semi-colon
     *
     * The error message can be a bit odd in cases where the "arguments" are
     * completely illegal syntactically, but that isn't worth the hassle of
     * fixing.
     *
     * We also can't do anything about cases that are legal Python 3 syntax
     * but mean something entirely different from what they did in Python 2
     * (omitting the arguments entirely, printing items preceded by a unary plus
     * or minus, using the stream redirection syntax).
     */
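The last paragraph of that comment can be checked directly. The sketch below only compiles a few Python 2 style print lines and never runs them; the helper name `compiles` and the sample lines are illustrative, not taken from CPython. It suggests why no hint is possible for these cases: they are valid Python 3 expressions, so no SyntaxError is raised at all.

```python
def compiles(source):
    """Return True if source is syntactically valid for this interpreter."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError:
        return False
    return True


# Python 2 print statements that are still legal Python 3 syntax,
# but with a different meaning (illustrative samples).
legal_in_python3 = [
    "print",                     # a bare name expression; prints nothing
    "print -1",                  # binary minus: print - 1
    "print +1",                  # binary plus: print + 1
    'print >>sys.stderr, "hi"',  # a tuple: (print >> sys.stderr, "hi")
]

for source in legal_in_python3:
    print(repr(source), "compiles:", compiles(source))

# A print statement followed by a plain name is not legal Python 3 syntax.
print(repr("print max"), "compiles:", compiles("print max"))
```

Actually running the first group would either do nothing useful or fail with a TypeError at run time, which is a different kind of error from the one discussed here.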
    

    That comment already explains a lot. In addition, the SyntaxError_init function in the same file contains:

        /*
         * Issue #21669: Custom error for 'print' & 'exec' as statements
         *
         * Only applies to SyntaxError instances, not to subclasses such
         * as TabError or IndentationError (see issue #31161)
         */
        if ((PyObject*)Py_TYPE(self) == PyExc_SyntaxError &&
                self->text && PyUnicode_Check(self->text) &&
                _report_missing_parentheses(self) < 0) {
            return -1;
        }
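The type check in that condition can be observed from Python. The following is a small sketch, assuming a reasonably recent Python 3 interpreter; the helper name `syntax_error_for` is made up for this example. It compiles one line that fails with a plain SyntaxError and one that fails with IndentationError, and prints whichever message the interpreter produces. The exact wording of the messages differs between Python versions, so none is shown here.

```python
def syntax_error_for(source):
    """Compile source and return the SyntaxError (or subclass) raised, else None."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as exc:
        return exc
    return None


# A plain SyntaxError: eligible for the print hint.
plain = syntax_error_for("print max\n")

# Same print statement, but the missing indentation is reported first,
# as an IndentationError (a subclass of SyntaxError).
indented = syntax_error_for("if True:\nprint max\n")

print(type(plain).__name__, "-", plain.msg)
print(type(indented).__name__, "-", indented.msg)
```

The second error keeps its indentation message instead of being rewritten into a remark about parentheses, which is the behaviour the comment attributes to issue #31161.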
    

    Note also that the comment above references issue #21669 on the Python bug tracker, which contains some discussion between the author and Guido about how to go about this. So we follow the rabbit (that is, _report_missing_parentheses), which is at the very bottom of the file, and see...

    legacy_check_result = _check_for_legacy_statements(self, 0);
    

    However, there are some cases where this is bypassed and the normal SyntaxError message is printed; see MSeifert's answer for more about that. If we go one function up to _check_for_legacy_statements, we finally see the actual check for legacy print statements.

    /* Check for legacy print statements */
    if (print_prefix == NULL) {
        print_prefix = PyUnicode_InternFromString("print ");
        if (print_prefix == NULL) {
            return -1;
        }
    }
    if (PyUnicode_Tailmatch(self->text, print_prefix,
                            start, text_len, -1)) {
    
        return _set_legacy_print_statement_msg(self, start);
    }
    

    So, to answer the question "Why isn't Python able to detect the problem earlier?": the missing parentheses are not what the parser detects. The line is a plain syntax error the whole time. The check for a legacy print statement only runs afterwards, while the SyntaxError is being initialized, and it inspects the text of the offending line just to attach an additional hint.
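To make the heuristic concrete, here is a rough Python model of it. This is a sketch, not the CPython implementation. It assumes that the bypass mentioned above is "keep the default message for any line containing an opening parenthesis", which matches the outputs in the question but is not shown in the excerpts quoted here. The function names and the `HINT` constant are invented for this example.

```python
HINT = "Missing parentheses in call to 'print'"


def _starts_with_print(text, start):
    """Mimic the prefix test: skip whitespace from start, then look for 'print '."""
    rest = text[start:].lstrip()
    return rest.startswith("print ")


def legacy_print_hint(text):
    """Return HINT if the line looks like a legacy print statement, else None.

    text is the source line attached to an already raised SyntaxError.
    """
    # Assumption: lines with an opening parenthesis keep the default message.
    if "(" in text:
        return None
    # Top-level statement, or a statement in a nested (indented) suite.
    if _starts_with_print(text, 0):
        return HINT
    # Trailing section of a one-line compound statement, e.g. "if x: print y".
    colon = text.find(":")
    if colon != -1 and _starts_with_print(text, colon + 1):
        return HINT
    return None


for line in ["print max", "print max([1,2])", "print max.__call__(23)",
             "if x: print y", "x = 1; print x"]:
    print(repr(line), "->", legacy_print_hint(line))
```

Under these assumptions the model gives the hint for `print max` and for `if x: print y`, and falls back to the default message for the two lines from the question that contain a call, as well as for the statement after a semicolon that the block comment lists as a known gap.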