c gcc compiler-construction clang llvm

Why do compilers perform aliasing if it slows runtime performance?


I've been learning C and computer science topics out of pure interest, and it has led me to an interest in compilers. Everything I've read tells me that aliasing results in slower assembly output that needs to reload values on every iteration.

I've been able to get a slight speedup on some benchmarks with the Intel C / C++ compiler using the flag -fno-alias. GCC and Clang / LLVM do not have an equivalent flag. There are -fargument-noalias and -fargument-noalias-global, which I am guessing disable aliasing on function arguments (please correct me if I'm wrong), but I have not noticed them making any difference in runtime or even compile-time performance.

Does the statement

Aliasing affects performance by preventing the compiler from doing certain optimizations.

always hold true, and what are the benefits of compiler aliasing? Why is it a feature of modern compilers that cannot be easily turned off with a flag except for the Intel compiler? Will compiling without aliasing result in code non-conformant to ABI standards?


Solution

  • Compilers don't perform aliasing. Your program might perform aliasing.

    The compiler either allows for the possibility that your pointers alias, or assumes that they don't and then generates machine code that doesn't work if they do.

    For example, you write code like this:

    void f(int *x, int *y, int *z) {
        *y += *x;
        *z += *x;
    }
    

    The function just adds *x to *y and *z, right? So the assembly code should be like this, right? (I wrote the assembly code in pseudo-C)

    eax = argument x
    ebx = argument y
    ecx = *eax
    *ebx += ecx
    ebx = argument z
    *ebx += ecx
    

    Wrong. If y and x point to the same place, this assembly code doesn't behave the same as the C code, because the second addition uses the old value of *x, loaded before *y was updated. To work right it would have to be like this:

    eax = argument x
    ebx = argument y
    ecx = *eax
    *ebx += ecx
    ebx = argument z
    ecx = *eax // extra instruction
    *ebx += ecx
    

    That's one more instruction, and it only matters if x==y, but the compiler has to put it there just in case. Aliasing means that *x and *y are different names for the same object. If the compiler assumes that x and y can't be the same pointer (i.e. there isn't aliasing) then it can skip this instruction, so there are fewer instructions and the program runs faster.
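
    The difference between the two assembly listings can be reproduced in plain C. This is only a sketch: f_reload, f_cached and alias_demo are hypothetical names, and f_cached is a hand-written model of the shorter listing, not actual compiler output.

```c
#include <assert.h>

/* What the C source means: *x is read again after the store to *y. */
void f_reload(int *x, int *y, int *z) {
    *y += *x;
    *z += *x;          /* second load of *x */
}

/* Hand-written model of the shorter listing: *x is loaded once and reused. */
void f_cached(int *x, int *y, int *z) {
    int tmp = *x;      /* single load */
    *y += tmp;
    *z += tmp;
}

/* Calls both versions with x == y, the aliasing case described above. */
void alias_demo(void) {
    int a = 1, c = 10;
    f_reload(&a, &a, &c);   /* a becomes 2, then c += 2 */
    assert(a == 2 && c == 12);

    int b = 1, d = 10;
    f_cached(&b, &b, &d);   /* b becomes 2, but d += the old value 1 */
    assert(b == 2 && d == 11);
}
```

    With distinct pointers the two versions give the same result; they only diverge when x and y point to the same int.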

    You can tell the compiler that they don't alias by using the restrict keyword (added in C99), like this:

    void f(int *restrict x, int *restrict y, int *restrict z) {
        *y += *x;
        *z += *x;
    }
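
    The same idea applies to the "reload on every iteration" case from the question. In the sketch below, add_scalar and add_scalar_demo are hypothetical names. Without restrict, the compiler has to allow for k pointing into dst and re-read *k on each pass. With restrict it is permitted to load *k once before the loop, though whether a given compiler actually does so depends on the compiler and the optimization level.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: adds the scalar *k to each of the n elements of dst.
 * restrict is a promise by the caller that k does not point into dst. */
void add_scalar(int *restrict dst, const int *restrict k, size_t n) {
    for (size_t i = 0; i < n; i++) {
        dst[i] += *k;   /* *k may be kept in a register across iterations */
    }
}

/* Valid use: k is a separate object, so the promise holds. */
void add_scalar_demo(void) {
    int v[3] = {1, 2, 3};
    int k = 5;
    add_scalar(v, &k, 3);
    assert(v[0] == 6 && v[1] == 7 && v[2] == 8);
}
```

    Calling add_scalar(v, &v[0], 3) would break the promise, and the behavior would be undefined.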
    

    -fno-alias tells the Intel compiler to assume that aliasing can't happen anywhere. This is likely to create bugs somewhere, because parameters do alias sometimes, although it might work fine on small programs where none of them do. restrict is more carefully targeted at the specific parameters that you know don't alias.