What does the power operator (**) in Python translate into?


In other words, what exists behind the two asterisks? Is it simply multiplying the number x times or something else?

As a follow-up question, is it better to write 2**3 or 2*2*2? I'm asking because I've heard that in C++ it's better not to use pow() for simple calculations, since it calls a function.


Solution

  • If you're interested in the internals, I'd disassemble the expression to see the CPython bytecode it maps to. Using Python 3:

    >>> import dis
    >>> def test():
    ...     return 2**3
    ...
    >>> dis.dis(test)
      2           0 LOAD_CONST               3 (8)
                  3 RETURN_VALUE
    

    OK, so the calculation was done at compile time and only the result was stored: the compiler folded 2**3 into the constant 8. You get exactly the same CPython bytecode for 2*2*2 (feel free to try it). So, for expressions that evaluate to a constant, the bytecode is identical and the choice doesn't matter.
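
The folding described above can also be checked programmatically instead of by reading the disassembly. This is a minimal sketch: the function names and the `binary_ops` helper are made up for illustration, and constant folding is a CPython implementation detail rather than a language guarantee, so other interpreters may behave differently.

```python
import dis


def power_const():
    return 2 ** 3


def mult_const():
    return 2 * 2 * 2


def binary_ops(func):
    """Return the names of any BINARY_* instructions left in func's bytecode."""
    return [
        ins.opname
        for ins in dis.get_instructions(func)
        if ins.opname.startswith("BINARY")
    ]


# If both expressions were folded at compile time, neither list should
# contain an arithmetic instruction.
print(binary_ops(power_const))
print(binary_ops(mult_const))
```

On CPython both lists should come back empty, which is another way of saying that no power or multiply instruction survives into the compiled function.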

    What if you want the power of a variable?

    Now you get two different bits of bytecode:

    >>> def test(n):
    ...     return n ** 3
    ...
    >>> dis.dis(test)
      2           0 LOAD_FAST                0 (n)
                  3 LOAD_CONST               1 (3)
                  6 BINARY_POWER
                  7 RETURN_VALUE
    

    vs.

    >>> def test(n):
    ...     return n * 2 * 2
    ...
    >>> dis.dis(test)
      2           0 LOAD_FAST                0 (n)
                  3 LOAD_CONST               1 (2)
                  6 BINARY_MULTIPLY
                  7 LOAD_CONST               1 (2)
                 10 BINARY_MULTIPLY
                 11 RETURN_VALUE
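
The opcode names in these listings depend on the interpreter version: as far as I know, CPython 3.11 and later emit a generic BINARY_OP instruction (with the operator shown as its argument) in place of BINARY_POWER and BINARY_MULTIPLY. A rough, version-tolerant way to compare the two functions is to list only the arithmetic instructions. The names `cube_power`, `times_four`, and `arithmetic_ops` are hypothetical, chosen for this sketch.

```python
import dis


def cube_power(n):
    return n ** 3


def times_four(n):
    return n * 2 * 2


def arithmetic_ops(func):
    """List (opname, argrepr) for each BINARY_* instruction in func."""
    return [
        (ins.opname, ins.argrepr)
        for ins in dis.get_instructions(func)
        if ins.opname.startswith("BINARY")
    ]


# One arithmetic instruction for the power, two for the chained multiply.
print(arithmetic_ops(cube_power))
print(arithmetic_ops(times_four))
```

Whatever the opcodes are called on your version, the shape should match the listings above: a single operation for the power and two for the multiplication chain.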
    

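    Now the question is, of course: is BINARY_MULTIPLY quicker than BINARY_POWER? (Note that n * 2 * 2 is not numerically equal to n ** 3; the like-for-like equivalent would be n * n * n, which also compiles to two BINARY_MULTIPLY instructions.)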

    The best way to find out is to measure it with timeit. I'll use the IPython %timeit magic. Here's the output for multiplication:

    %timeit test(100)
    The slowest run took 15.52 times longer than the fastest. This could mean that an intermediate result is being cached 
    10000000 loops, best of 3: 163 ns per loop
    

    and for power:

    %timeit test(100)
    The slowest run took 5.44 times longer than the fastest. This could mean that an intermediate result is being cached
    1000000 loops, best of 3: 473 ns per loop
    

    You may wish to repeat this for representative inputs, but empirically multiplication looks quicker here, by roughly a factor of three (163 ns vs. 473 ns per loop). Bear in mind the caveat in the output about the variance between runs.
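
If you are not in IPython, the same kind of measurement can be sketched with the standard-library timeit module. The helper `best_ns_per_call` and the two function names are hypothetical, and the loop counts are arbitrary. Here the multiplication variant is written as n * n * n so that both functions compute the same value. The lambda wrapper adds the same call overhead to both measurements, so only the relative difference is meaningful, and the numbers will vary by machine and Python version.

```python
import timeit


def cube_power(n):
    return n ** 3


def cube_mult(n):
    return n * n * n


def best_ns_per_call(func, arg, number=100_000, repeat=5):
    """Best-of-`repeat` time per call, in nanoseconds."""
    timer = timeit.Timer(lambda: func(arg))
    # Taking the minimum reduces the influence of noisy runs.
    best = min(timer.repeat(repeat=repeat, number=number))
    return best / number * 1e9


for func in (cube_power, cube_mult):
    print(f"{func.__name__}: {best_ns_per_call(func, 100):.0f} ns per call")
```

Try it with inputs that resemble your real data (small ints, large ints, floats), since the relative cost may shift with the operand type and size.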

    If you want further internals, I'd suggest digging into the CPython code.
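
Before opening the C source, one piece of what sits behind the two asterisks is visible from Python itself: at the language level, x ** y dispatches to the `__pow__` method of the left operand's type (with `__rpow__` on the right operand as a fallback). The sketch below uses a hypothetical `Loud` class, invented purely for illustration, to make that dispatch observable.

```python
import operator


class Loud:
    """Hypothetical wrapper that counts how often ** reaches __pow__."""

    def __init__(self, value):
        self.value = value
        self.calls = 0

    def __pow__(self, exponent):
        self.calls += 1
        return self.value ** exponent


x = Loud(2)
print(x ** 3)   # routed through Loud.__pow__
print(x.calls)  # number of times __pow__ ran

# For built-in ints, these spellings all produce the same result.
print(operator.pow(2, 3), (2).__pow__(3), pow(2, 3))
```

So the operator is not a fixed "multiply x times" loop: each type decides how to implement it, and for built-in numbers that implementation lives in CPython's C code.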