I'm compiling this simple program:
#include <numeric>

int main()
{
    int numbers[] = {1, 2, 3, 4, 5};
    auto num_numbers = sizeof(numbers)/sizeof(numbers[0]);
    return std::accumulate(numbers, numbers + num_numbers, 0);
}
which sums up the integers 1 through 5 and returns that sum (i.e. 15).
I realize std::accumulate can have a bit of trickery in the implementation, but still, this is pretty straightforward. I'm surprised by what I get when compiling this (on GodBolt), though.
With -O3, and with C++ being a compilation-time-computation-oriented language, I get the expected:
main:
mov eax, 15
ret
but if I go down to -O2 (still some heavy optimization), not only do I not get this compile-time computation, but I see this strange piece of assembly:
main:
movabs rax, 8589934593
lea rdx, [rsp-40]
mov ecx, 1
mov DWORD PTR [rsp-24], 5
mov QWORD PTR [rsp-40], rax
lea rsi, [rdx+20]
movabs rax, 17179869187
mov QWORD PTR [rsp-32], rax
xor eax, eax
jmp .L3
.L5:
mov ecx, DWORD PTR [rdx]
.L3:
add rdx, 4
add eax, ecx
cmp rdx, rsi
jne .L5
ret
Now, .L5 and .L3 I get. The surprising thing is these strange movabs instructions, to and from rax. What do they mean and why are they there?
PS - I compiled using GCC 8.2 on an x86_64 with no -march set. If I add -march=skylake, the -O3 gets messed up too!
Edit: This seems to be a regression in GCC, see my GCC bug report. Thanks @FlorianWeimer!
8589934593 is 0x200000001 in hexadecimal, and 17179869187 is 0x400000003. These two movabs instructions simply load two int constants into one 64-bit register each, for initializing the array on the stack. Each 64-bit store then writes two adjacent array elements at once (1 and 2, then 3 and 4), while the fifth element, 5, is stored separately with a 32-bit mov. You can disable this GCC optimization using -fno-store-merging; then you will get something like this at -O2 for the array initialization:
movl $1, -40(%rsp)
…
…
movl $2, -36(%rsp)
…
movl $3, -32(%rsp)
movl $4, -28(%rsp)
movl $5, -24(%rsp)
The lack of optimization to a single constant looks like a GCC regression, by the way. I do not see this with GCC 6.3. It could actually be related to store merging, which I do not think was part of GCC 6.