
Why is the mypy FAQ mentioning performance impact?


As far as I understand, mypy is a tool that checks Python code that includes type annotations.

However, in the FAQ, I read the following:

Mypy only does static type checking and it does not improve performance. It has a minimal performance impact.

In the second sentence, "minimal" seems to imply that there is a performance impact (albeit a minimal one).

Why would mypy impact performance? I thought that in the end the code still has to be run by the Python interpreter, so mypy (or any other tool that analyses code, like flake8 or pylint) shouldn't have any impact, positive or negative, on performance.

Is it because the source code size is larger due to extra type annotations?


Solution

  • The FAQ talks about performance of your Python code.

    In some programming languages, type hints can help steer a just-in-time compiler towards more efficient compilation of the hinted code, and so improve performance. In Python this is not the case: the language runtime doesn't make use of type hints, which are treated as nothing more than metadata.
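
    As a minimal sketch of what "nothing more than metadata" means in practice (the function `double` is a made-up example, not something from the FAQ), the hints end up in `__annotations__` and are not enforced when the function is called:

```python
def double(n: int) -> int:
    # The annotations say "int", but nothing checks that at runtime.
    return n * 2

# Hints are stored as plain metadata on the function object.
print(double.__annotations__)

# A str argument contradicts the hints, yet the call runs without error;
# only a static checker such as mypy would flag it.
print(double("ab"))
```

    Running mypy over the same file would report the `double("ab")` call as a type error, but that check happens before the program runs and is not part of it.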

    The minimal performance impact then comes from the extra bytecode needed to run the hint definitions (imports, TypeVar assignments, and interpreting the annotations themselves). That impact is truly minimal, even when creating classes and functions repeatedly.

    You can make the impact visible by using type hints in code run via exec(); this is an extreme case where we add a lot more overhead to code that does very little:

    >>> import timeit
    >>> without_hints = compile("""def foo(bar): pass""", "", "exec")
    >>> with_hints = compile(
    ...     "from typing import List\ndef foo(bar: List[int]) -> None: pass",
    ...     "", "exec")
    >>> without_metrics = timeit.Timer('exec(s)', 'from __main__ import without_hints as s').autorange()
    >>> with_metrics = timeit.Timer('exec(s)', 'from __main__ import with_hints as s').autorange()
    >>> round(without_metrics[1] / without_metrics[0] * 1e6, 2)   # seconds -> microseconds per execution
    0.42
    >>> round(with_metrics[1] / with_metrics[0] * 1e6, 2)   # microseconds per execution
    1.91
    

    So adding type hints added ~1.5 microseconds of execution time, as Python has to import the List object from typing, and attach the hints to the function object created.

    1.5 microseconds is minimal for anything defined at the top level of a module, which only needs to be imported once.
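
    The cost is paid when the `def` statement runs, not on each call. A rough way to check this, assuming two hypothetical functions `plain` and `hinted` that differ only in their annotations, is to compare their compiled bodies and time the calls:

```python
import timeit

def plain(bar):
    return bar

def hinted(bar: int) -> int:
    return bar

# Annotations are attached when the def statement executes;
# they do not change the bytecode of the function body itself.
same_body = plain.__code__.co_code == hinted.__code__.co_code
print(same_body)

# Timing the calls should therefore show no systematic gap
# (exact figures vary by machine and Python version).
for func in (plain, hinted):
    number, seconds = timeit.Timer(lambda: func(1)).autorange()
    print(func.__name__, seconds / number * 1e6, "microseconds per call")
```

    Any difference between the two timings here is measurement noise rather than a cost of the hints.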

    You can see this when you disassemble the bytecode generated. Compare the version without hints:

    >>> import dis
    >>> dis.dis(without_hints)
      1           0 LOAD_CONST               0 (<code object foo at 0x10ace99d0, file "<dis>", line 1>)
                  2 LOAD_CONST               1 ('foo')
                  4 MAKE_FUNCTION            0
                  6 STORE_NAME               0 (foo)
                  8 LOAD_CONST               2 (None)
                 10 RETURN_VALUE
    
    Disassembly of <code object foo at 0x10ace99d0, file "<dis>", line 1>:
      1           0 LOAD_CONST               0 (None)
                  2 RETURN_VALUE
    

    with the version that is hinted:

    >>> dis.dis(with_hints)
      1           0 LOAD_CONST               0 (0)
                  2 LOAD_CONST               1 (('List',))
                  4 IMPORT_NAME              0 (typing)
                  6 IMPORT_FROM              1 (List)
                  8 STORE_NAME               1 (List)
                 10 POP_TOP
    
      2          12 LOAD_NAME                1 (List)
                 14 LOAD_NAME                2 (int)
                 16 BINARY_SUBSCR
                 18 LOAD_CONST               2 (None)
                 20 LOAD_CONST               3 (('bar', 'return'))
                 22 BUILD_CONST_KEY_MAP      2
                 24 LOAD_CONST               4 (<code object foo at 0x10ace99d0, file "<dis>", line 2>)
                 26 LOAD_CONST               5 ('foo')
                 28 MAKE_FUNCTION            4 (annotations)
                 30 STORE_NAME               3 (foo)
                 32 LOAD_CONST               2 (None)
                 34 RETURN_VALUE
    
    Disassembly of <code object foo at 0x10ace99d0, file "<dis>", line 2>:
      2           0 LOAD_CONST               0 (None)
                  2 RETURN_VALUE
    

    Python 3.7 introduced PEP 563 -- Postponed Evaluation of Annotations, aimed at reducing this cost a little and making forward references easier. For the over-simplified example above this doesn't actually reduce the time taken as loading the pre-defined annotations also takes some time:

    >>> pep563 = compile(
    ...     "from __future__ import annotations\nfrom typing import List\ndef foo(bar: List[int]) -> None: pass",
    ...     "", "exec")
    >>> pep563_metrics = timeit.Timer('exec(s)', 'from __main__ import pep563 as s').autorange()
    >>> round(pep563_metrics[1] / pep563_metrics[0] * 1e6, 2)   # seconds -> microseconds per execution
    1.93
    

    but for more complex, real-life type hinting projects this does make a small difference.
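
    To illustrate what postponed evaluation changes, here is a small sketch that reuses the compile()/exec() approach from above (the `namespace` dict and the `"<pep563>"` filename are just illustrative choices). With the `__future__` import the annotations are stored as strings and only resolved if something asks for them:

```python
import typing

source = (
    "from __future__ import annotations\n"
    "from typing import List\n"
    "def foo(bar: List[int]) -> None: pass\n"
)

# Run the module-like source in its own namespace.
namespace = {}
exec(compile(source, "<pep563>", "exec"), namespace)
foo = namespace["foo"]

# The hints were never evaluated; they are kept as source strings.
print(foo.__annotations__)

# Tools that need real objects can resolve the strings on demand.
print(typing.get_type_hints(foo))
```

    Because `List[int]` is no longer subscripted at definition time, the saving grows with the number and complexity of the annotations in a module.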