How does List Comprehension exactly work in Python?

I am going trough the docs of Python 3.X, I have doubt about List Comprehension speed of execution and how it exactly work.

Let's take the following example:

Listing 1

...
L = range(0,10)
L = [x ** 2 for x in L]
...

Now in my knowledge this return a new listing, and it's equivalent to write down:

Listing 2

...
res = []
for x in L:
  res.append(x ** 2)
...

The main difference is the speed of execution if I am correct. Listing 1 is supposed to be performed at C language speed inside the interpreter, meanwhile Listing 2 is not.

But Listing 2 is what the list comprehension does internally (not sure), so why Listing 1 is executed at C Speed inside the interpreter & Listing 2 is not? Both are converted to byte code before being processed, or am I missing something?

Solution

Look at the actual bytecode that is produced. I've put the two fragments of code into fuctions called f1 and f2.

The comprehension does this:

  3          15 LOAD_CONST               3 (<code object <listcomp> at 0x7fbf6c1b59c0, file "<stdin>", line 3>)
             18 LOAD_CONST               4 ('f1.<locals>.<listcomp>')
             21 MAKE_FUNCTION            0
             24 LOAD_FAST                0 (L)
             27 GET_ITER
             28 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             31 STORE_FAST               0 (L)

Notice there is no loop in the bytecode. The loop happens in C.

Now the for loop does this:

  4          21 SETUP_LOOP              31 (to 55)
             24 LOAD_FAST                0 (L)
             27 GET_ITER
        >>   28 FOR_ITER                23 (to 54)
             31 STORE_FAST               2 (x)
             34 LOAD_FAST                1 (res)
             37 LOAD_ATTR                1 (append)
             40 LOAD_FAST                2 (x)
             43 LOAD_CONST               3 (2)
             46 BINARY_POWER
             47 CALL_FUNCTION            1 (1 positional, 0 keyword pair)
             50 POP_TOP
             51 JUMP_ABSOLUTE           28
        >>   54 POP_BLOCK

In contrast to the comprehension, the loop is clearly here in the bytecode. So the loop occurs in python.

The bytecodes are different, and the first should be faster.