Tags: gcc, assembly, x86-64, sse, magic-numbers

Weird SSE assembler instructions for double negation


GCC and Clang seem to employ some dark magic. The C code just negates the value of a double, but the generated assembly involves a bitwise XOR and the instruction pointer. Can somebody explain what is happening and why this is an optimal solution? Thank you.

Contents of test.c:

void function(double *a, double *b) {
    *a = -(*b); // This line.
}

The resulting assembler instructions:

(gcc)
0000000000000000 <function>:
 0: f2 0f 10 06             movsd  xmm0,QWORD PTR [rsi]
 4: 66 0f 57 05 00 00 00    xorpd  xmm0,XMMWORD PTR [rip+0x0]        # c <function+0xc>
 b: 00 
 c: f2 0f 11 07             movsd  QWORD PTR [rdi],xmm0
10: c3                      ret 

(clang)
0000000000000000 <function>:
 0: f2 0f 10 06             movsd  xmm0,QWORD PTR [rsi]
 4: 0f 57 05 00 00 00 00    xorps  xmm0,XMMWORD PTR [rip+0x0]        # b <function+0xb>
 b: 0f 13 07                movlps QWORD PTR [rdi],xmm0
 e: c3                      ret    

The instruction at address 0x4 implements "This line", but I can't understand how it works. The xorpd/xorps instructions are supposed to perform a bitwise XOR, and PTR [rip+0x0] refers to the instruction pointer.

I suspect that at the moment of execution rip points somewhere near the 0f 57 05 00 00 00 0f run of bytes, but I can't quite figure out how this works or why both compilers choose this approach.

P.S. I should point out that this was compiled with -O3.


Solution

  • For me, the output of gcc with the -S -O3 options for the same code is:

        .file   "test.c"
        .text
        .p2align 4,,15
        .globl  function
        .type   function, @function
    function:
    .LFB0:
        .cfi_startproc
        movsd   (%rsi), %xmm0
        xorpd   .LC0(%rip), %xmm0
        movsd   %xmm0, (%rdi)
        ret
        .cfi_endproc
    .LFE0:
        .size   function, .-function
        .section    .rodata.cst16,"aM",@progbits,16
        .align 16
    .LC0:
        .long   0
        .long   -2147483648
        .long   0
        .long   0
        .ident  "GCC: (Ubuntu 6.3.0-12ubuntu2) 6.3.0 20170406"
        .section    .note.GNU-stack,"",@progbits
    

    Here the xorpd instruction uses instruction-pointer-relative addressing: the offset points to the .LC0 label, a 16-byte constant whose low 64 bits are 0x8000000000000000 (only bit 63 is set) and whose high 64 bits are zero. The low 64 bits come from these two lines:

    .LC0:
        .long   0
        .long   -2147483648
    

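    x86-64 is little-endian, so the first .long is the low half of the 64-bit value and the second is the high half. On a big-endian target these two lines would be swapped.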
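To make the little-endian layout concrete, here is a minimal sketch in C. The helper name combine_longs_le is hypothetical and only for illustration; it joins two 32-bit values the way two consecutive .long directives are read back as one 64-bit value on a little-endian target.

```c
#include <stdint.h>

/* Hypothetical helper (not part of the compiler output): combine two
 * consecutive 32-bit .long values into the 64-bit value a little-endian
 * CPU sees when it loads 8 bytes starting at the first one. */
uint64_t combine_longs_le(int32_t first, int32_t second) {
    /* The first .long sits at the lower address, so it is the low half;
     * the second .long is the high half. Cast through uint32_t to avoid
     * sign extension. */
    return ((uint64_t)(uint32_t)second << 32) | (uint64_t)(uint32_t)first;
}
```

Passing 0 and INT32_MIN (the value -2147483648 from the listing, bit pattern 0x80000000) should give 0x8000000000000000, a value with only bit 63 set.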

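    XORing the double with 0x8000000000000000 flips the sign bit (bit 63) and leaves the exponent and mantissa untouched, which is exactly negation: a positive value becomes negative and a negative value becomes positive. SSE has no dedicated floating-point negate instruction, so a single XOR against a constant in memory is the cheapest way to do it.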
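The same trick can be written by hand in C. This is only a sketch of what the xorpd does to the low 64 bits of the register, assuming the usual IEEE 754 binary64 representation of double; negate_by_xor is a made-up name, not something the compiler emits.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: negate a double by flipping bit 63, assuming
 * double is IEEE 754 binary64 (true on x86-64). */
double negate_by_xor(double x) {
    uint64_t bits;
    memcpy(&bits, &x, sizeof bits);        /* reinterpret the double as raw bits */
    bits ^= UINT64_C(0x8000000000000000);  /* flip only the sign bit */
    memcpy(&x, &bits, sizeof x);           /* reinterpret the bits as a double */
    return x;
}
```

Calling it twice returns the original value, since XORing with the same mask twice is a no-op.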

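    clang uses xorps for the same purpose. xorps and xorpd perform the identical 128-bit bitwise XOR; xorps simply lacks the 66 prefix, so its encoding is one byte shorter (compare the byte columns of the two listings in the question).

    If you run objdump with the -r option, it also shows the relocations that must be applied to the program before it runs.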

    objdump -d test.o -r

    test.o:     file format elf64-x86-64
    
    
    Disassembly of section .text:
    
    0000000000000000 <function>:
       0:   f2 0f 10 06             movsd  (%rsi),%xmm0
       4:   66 0f 57 05 00 00 00    xorpd  0x0(%rip),%xmm0        # c <function+0xc>
       b:   00 
                8: R_X86_64_PC32    .LC0-0x4
       c:   f2 0f 11 07             movsd  %xmm0,(%rdi)
      10:   c3                      retq   
    
    Disassembly of section .text.startup:
    
    0000000000000000 <main>:
       0:   31 c0                   xor    %eax,%eax
       2:   c3                      retq   
    

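    Here the relocation of type R_X86_64_PC32 applies to the four displacement bytes at offsets 0x8 through 0xb of function. The linker fills them with the distance to .LC0; the -0x4 addend compensates for the fact that a rip-relative address is measured from the end of the instruction (0xc), which is four bytes past the start of the field. That is why the unlinked object file shows 00 00 00 00 there and the disassembly reads [rip+0x0].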
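The arithmetic behind the -0x4 addend can be checked with a small sketch. The x86-64 psABI defines R_X86_64_PC32 as S + A - P (symbol address, addend, address of the field being patched). The function names below are hypothetical, and any concrete addresses you feed in are made up; only the relative offsets 0x8 and 0xc come from the listing.

```c
#include <stdint.h>

/* Value the linker stores in the 4-byte field for R_X86_64_PC32:
 * S = address of the symbol (.LC0), A = addend (-4 here),
 * P = address of the field being patched (function + 0x8). */
int32_t pc32_field(uint64_t S, int64_t A, uint64_t P) {
    return (int32_t)(S + (uint64_t)A - P);
}

/* Address the CPU forms for a rip-relative operand: the address of the
 * NEXT instruction plus the sign-extended 32-bit displacement. */
uint64_t rip_relative_target(uint64_t next_insn, int32_t disp) {
    return next_insn + (uint64_t)(int64_t)disp;
}
```

For example, if function were placed at 0x401000 and .LC0 at 0x402010 (both invented for illustration), the field at 0x401008 would hold 0x1004, and the CPU would compute 0x40100c + 0x1004 = 0x402010, which is exactly .LC0.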

    PS: I'm using gcc 6.3.0