Search code examples
optimizationcompilationjulia

Why doesn't the Julia compiler optimise this loop?


I know very little about compiling but I was surprised to see that Julia compiler doesn't optimize several processes.

Let's consider Julia (that is a Just-in-Time compiler) and let's consider these two pieces of code that do essentially the same thing.

n=1
@time for i = 1:10^8 n=n+1 end
elapsed time: 3.535394087 seconds (0 bytes allocated)

n=1
@time n=n+10^8
elapsed time: 6.599e-6 seconds (112 bytes allocated)

Why is a modern compiler not able to understand that this long loop will do nothing but adding 10^8 to n?

The following example is even more striking I think

n=1
@time for i = 1:10^9 n=n end
elapsed time: 3.496573198 seconds (0 bytes allocated)

Solution

  • This is a problem when executing in global scope and is connected with optimizing under conditions where types could change, but given the right circumstances, the situation changes. Constraining the evaluation within a function allows the compiler to do much more. Consider the same thing but in a function.

    function f(n::Int64)
       x = 0;
       for i = 1:n
          x = x + 1;
       end
       return x;
    end
    
    
    julia> @time f(100)
    elapsed time: 2.93e-6 seconds (80 bytes allocated)
    100
    
    julia> @time f(Int64(1e11))
    elapsed time: 4.632e-6 seconds (112 bytes allocated)
    100000000000
    

    By checking the compiler output using code_native, you can see that the loop is optimized out

    julia> code_native(f,(Int64,)) 
    
    Source line: 6
        push    RBP
        mov RBP, RSP
        test    RDI, RDI
        jg  L15
        xor EDI, EDI
    Source line: 6
    L15:    mov RAX, RDI 
        pop RBP
        ret