Lately, I've started exploring the compilation process of Python source code. While exploring, I ran into some confusing results.
import dis


def f():
    x = 1
    return x


dis.dis(f)
Output:
  4           0 RESUME                   0

  5           2 LOAD_CONST               1 (1)
              4 STORE_FAST               0 (x)

  6           6 LOAD_FAST                0 (x)
              8 RETURN_VALUE
In the example above, I created a function that performs the following steps:
Create a variable and bind it to the integer object whose value is one. Notice that the interpreter didn't create the integer object here, since it was already created at the beginning of the program (an optimization). It is a singleton object, so there can be only one integer object whose value is one.
Return the object.
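To see the singleton behaviour I am describing, here is a minimal sketch. It relies on a CPython implementation detail (the cache of small integers from -5 to 256), so other interpreters may behave differently. I build the values with int("...") at run time so that the compiler has no chance to merge equal literals; the variable names are just for illustration.

```python
# int("...") builds each value at run time, so the compiler cannot share literals.
a, b = int("256"), int("256")   # inside CPython's small-integer cache
c, d = int("257"), int("257")   # just outside the cache

small_shared = a is b   # do both names refer to the very same object?
large_shared = c is d   # equal values, but are they the same object?

print("256 shared:", small_shared)
print("257 shared:", large_shared)
```

On CPython I would expect the first check to be True and the second to be False, which is what the cached range suggests.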
So far, there is no problem. The next example, on the other hand, confused me.
import dis
import sys

print(sys.getrefcount(10 ** 32))


def f():
    x = 10 ** 32
    return x


dis.dis(f)
Output:
4
  7           0 RESUME                   0

  8           2 LOAD_CONST               1 (100000000000000000000000000000000)
              4 STORE_FAST               0 (x)

  9           6 LOAD_FAST                0 (x)
              8 RETURN_VALUE
Since getrefcount returned more than one, I concluded that this object was created at the start of the program.
The questions are:
Why isn't the arithmetic operation instruction visible when function f's byte code is disassembled?
Let's say the reason is that the interpreter creates the object at the beginning of the program and simply loads it here.
If that is the reason, how does the interpreter know that the result of 10 ** 32 already exists before it executes the operation that produces that result?
And why is such a large number created at the beginning of the program? The Python documentation states that integers in the range -5 to 256 are created at the start of the program. Obviously, 10 ** 32 is outside this range.
import dis
import sys

print(sys.getrefcount(10 ** 33))


def f():
    x = 10 ** 33
    return x


dis.dis(f)
Output:
1
  7           0 RESUME                   0

  8           2 LOAD_CONST               1 (10)
              4 LOAD_CONST               2 (33)
              6 BINARY_OP                8 (**)
             10 STORE_FAST               0 (x)

  9          12 LOAD_FAST                0 (x)
             14 RETURN_VALUE
That would mean the interpreter didn't create this object at the start of the program, so the upper limit seems to be somewhere close to 10 ** 33 (contradicting what is written in the documentation).
I expected the arithmetic operation to be visible in the second example, as it is in the third example, but it is not.
I expected integers higher than 256 not to be created at the start of the program, according to the documentation, but they are, and the actual upper limit appears to be much higher.
I've since found out that Python implements an optimization called "constant folding", which is responsible for the difference between the last two examples. Since other sources explain it much better than I can, I will simply share the links to them.
Here is the video where I found the reason => https://youtu.be/HVUTjQzESeo?t=1747
Another resource here => What are the specific rules for constant folding?
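For anyone who wants to check constant folding without reading disassembly by eye, here is a rough sketch. The helper name is_folded is my own, not part of any library. It compiles an expression and reports whether a run-time power instruction is still present. I am assuming the instruction is named BINARY_OP (Python 3.11 and later) or BINARY_POWER (earlier versions); the exact folding limits are a CPython implementation detail and may differ between versions.

```python
import dis

# Assumed opcode names for "**": BINARY_OP on 3.11+, BINARY_POWER before that.
# The expressions below contain no other operator, so either name means "**".
POWER_OPNAMES = {"BINARY_OP", "BINARY_POWER"}


def is_folded(expression):
    """Return True if the compiled expression has no run-time power instruction."""
    code = compile(expression, "<example>", "eval")
    return not any(
        ins.opname in POWER_OPNAMES for ins in dis.get_instructions(code)
    )


for expression in ("2 ** 8", "10 ** 32", "10 ** 33", "10 ** 100000"):
    state = "folded at compile time" if is_folded(expression) else "computed at run time"
    print(expression, "->", state)
```

With the interpreter I used for the examples above, I would expect 10 ** 32 to be reported as folded and 10 ** 33 as computed at run time, matching the two disassemblies.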