python, interpreter, python-internals

Does Python optimize away a variable that's only used as a return value?


Is there any ultimate difference between the following two code snippets? The first assigns a value to a variable in a function and then returns that variable. The second function just returns the value directly.

Does Python turn them into equivalent bytecode? Is one of them faster?

Case 1:

def func():
    a = 42
    return a

Case 2:

def func():
    return 42

Solution

  • No, it doesn't.

    When CPython compiles source to bytecode, the result is passed only through a small peephole optimizer that is designed to do basic optimizations (see test_peepholer.py in the test suite for more on these optimizations).
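    As a rough illustration of the kind of basic optimization CPython does perform, the sketch below checks for constant folding. The function name seconds_per_day is made up for this example, and the exact opcode names printed depend on the CPython version in use.

```python
import dis

def seconds_per_day():
    # Hypothetical example function: an expression built only from constants.
    return 24 * 60 * 60

# Collect the instruction names and the constants stored in the code object.
names = [instr.opname for instr in dis.get_instructions(seconds_per_day)]
consts = seconds_per_day.__code__.co_consts

print(names)   # opcode names differ between CPython versions
print(consts)  # the folded product should appear among the constants
```

    If the folded value shows up in co_consts and no multiplication instruction appears in the listing, the arithmetic was done at compile time. This is about as far as CPython goes; it does not remove the local variable in the question.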

    To take a look at what's actually going to happen, use dis* to see the instructions generated. For the first function, containing the assignment:

    from dis import dis
    dis(func)
      2           0 LOAD_CONST               1 (42)
                  2 STORE_FAST               0 (a)
    
      3           4 LOAD_FAST                0 (a)
                  6 RETURN_VALUE
    

    For the second function (named func2 here to tell the two apart):

    dis(func2)
      2           0 LOAD_CONST               1 (42)
                  2 RETURN_VALUE
    

    The first uses two more (fast) instructions: STORE_FAST and LOAD_FAST. These store the value in, and then load it back from, the fastlocals array of the current execution frame. Then, in both cases, a RETURN_VALUE is performed. So the second is ever so slightly faster because fewer instructions need to execute.
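    The comparison can also be done programmatically instead of by reading the listings. This is a minimal sketch: the names with_variable, direct and opnames are invented for the example, and because opcode names and counts change between CPython versions, it only checks for the presence of a STORE_FAST-style instruction rather than an exact listing.

```python
import dis

def with_variable():
    a = 42
    return a

def direct():
    return 42

def opnames(func):
    """Return the list of bytecode instruction names for func."""
    return [instr.opname for instr in dis.get_instructions(func)]

names_with_variable = opnames(with_variable)
names_direct = opnames(direct)

# Exact opcodes vary by version, so print them rather than assume a listing.
print(names_with_variable)
print(names_direct)

# The store to the local appears only in the function that assigns to `a`.
has_store = any(name.startswith("STORE_FAST") for name in names_with_variable)
print(has_store)
```

    On the CPython versions I am aware of, the first list is longer than the second, which matches the disassembly shown above.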

    In general, be aware that the CPython compiler is conservative in the optimizations it performs. It isn't, and doesn't try to be, as smart as other compilers (which, in general, also have much more information to work with). The main design goals, apart from obviously being correct, are to a) keep it simple and b) compile as swiftly as possible, so you don't even notice that a compilation phase exists.

    In the end, you shouldn't trouble yourself with small issues like this one. The speed benefit is tiny, constant, and dwarfed by the overhead introduced by the fact that Python is interpreted.
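    If you want to see how small the difference is on your own machine, a timing sketch along these lines should do. The function names and the repeat counts are arbitrary choices for the example, and the numbers will vary from run to run and between interpreter versions, so no expected output is given.

```python
import timeit

def with_variable():
    a = 42
    return a

def direct():
    return 42

# Take the best of several repeats to reduce noise from other processes.
t_var = min(timeit.repeat(with_variable, number=200_000, repeat=5))
t_direct = min(timeit.repeat(direct, number=200_000, repeat=5))

print(f"with variable: {t_var:.4f} s for 200,000 calls")
print(f"direct return: {t_direct:.4f} s for 200,000 calls")
```

    Most of the measured time is likely to be the function call itself, which is the point: the extra store and load are hard to even pick out of the noise.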

    *dis is a small Python module that disassembles your code; you can use it to see the Python bytecode that the VM will execute.

    Note: As also stated in a comment by @Jorn Vernee, this is specific to the CPython implementation of Python. Other implementations might perform more aggressive optimizations if they so desire; CPython doesn't.