Search code examples
cpointersgccpointer-arithmetic

Invalid optimization for pointer comparison in GCC? (introduced in GCC 7.1)


I have a puzzling problem regarding optimization and pointer equality that seems to happen only in GCC from version 7.1 and higher until the latest GCC version.

The Problem can be shown with one simple example:

#include <stdio.h>
#include <stddef.h>

int main() {
    int test1[3] = {0};
    int test2[3] = {0};
    ptrdiff_t diff = test2 - test1;
    int *test3 = test1 + diff;

    printf("%p == %p = %d\n", test1 + diff, test2, test1 + diff == test2);
    printf("%p == %p = %d\n", test1 + diff + 2, test2 + 2, test1 + diff + 2 == test2 + 2);
    printf("%p == %p = %d\n", test3 + 2, test2 + 2, test3 + 2 == test2 + 2);

    printf("\n\n");

    printf("%p == %p = %d\n", &(test1 + diff)[0], &test2[0], &(test1 + diff)[0] == &test2[0]);
    printf("%p == %p = %d\n", &(test1 + diff)[2], &test2[2], &(test1 + diff)[2] == &test2[2]);
    printf("%p == %p = %d\n", &test3[2], &test2[2], &test3[2] == &test2[2]);
    
    return 0;
}

The code performs pointer arithmetic between the two arrays. The goal is to find the Element in array2 at the same offset in array1. diff is the offset of the first elements of the arrays in memory. The problem only occurs with optimization set to -O1 or higher.

The expected output would be for all checks to be = 1 however this program will print

0x7fff4aa00e78 == 0x7fff4aa00e78 = 1
0x7fff4aa00e80 == 0x7fff4aa00e80 = 0
0x7fff4aa00e80 == 0x7fff4aa00e80 = 1


0x7fff4aa00e78 == 0x7fff4aa00e78 = 1
0x7fff4aa00e80 == 0x7fff4aa00e80 = 0
0x7fff4aa00e80 == 0x7fff4aa00e80 = 1

If I perform the check of the pointer to the first element directly it shows as equal but if i try to compare the nth element in the array the comparison shows as false, even though the address is clearly the same.

More interesting even, if i store the result of the arithmetic operation in a new pointer and compare the array elements from there, it shows up as equal again.

I have searched far and wide and have not found a suitable explanation for this problem other than gcc performing an invalid optimization here. The closest i could find was described in this Blogpost http://kristerw.blogspot.com/2016/12/pointer-comparison-invalid-optimization.html. However the problem in the Blog targets the "one element after" specifically which is not the case in my problem.

I could find no other compiler that exhibits this exact problem (tested with several MSVC and clang versions). Compiling with GCC 6.5 shows the correct output.


Solution

  • Pointer arithmetic in C has always only been allowed to get carried out inside an array. Where single variables like structs etc (scalars) can get treated like an array of just one single item.

    Doing pointer arithmetic outside the bounds of an array is undefined behavior - anything can happen. And as we can see from your example, this appears to be the perfect example of "anything happens". What is undefined behavior and how does it work?

    Specifically, C23 6.5.7 (additive operators) says:

    If the pointer operand and the result do not point to elements of the same array object or one past the last element of the array object, the behavior is undefined.


    Update:

    Trying to explain why one particular form of undefined behavior happened is a rather uninteresting exercise from which not much of value is learnt, but anyway... here goes:

    Specifically, what seems to happen (for me) when optimizations are disabled -O0 is that the arithmetic actually gets carried out in the generated machine code:

        lea     rax, [rbp-40]
        lea     rdx, [rbp-28]
        sub     rax, rdx
    

    That is, two addresses are loaded from an address relative to the stack pointer, then subtracted - as you expect would happen.

    Enabling optimizations, something different happens. For the first printf line it loads the address of the first argument then copies it to the second - the compiler assuming they are the same:

        lea     rsi, [rsp+4]
        mov     rdx, rsi
    

    There is no subtraction in run-time. Instead the compiler pre-calculates the == comparison of the addresses at compile-time as a 1:

        mov     ecx, 1
    

    In the second printf it does the same thing (lea + mov) but at the same time the compiler simply decided no - they are not the same:

        xor     ecx, ecx
    

    This is setting ecx to zero for the == output. It's a x86_64 convention to use xor when setting a register to zero since it's a fast instruction.

    Why it decided to do that, I don't know. Maybe since it is free to come to any conclusion it likes because of undefined behavior, it went with xor since that's maybe a clock tick faster than mov or something like that.

    The parameter shuffling of the first two printf parameters is probably detached from the compile-time calculation of == and that's why the compiler decided "these are the same value" and then in the next line "but they aren't equivalent".