
How to ask Clang++ not to cache function result during -O3 optimization?


This is my code:

int foo(int x) {
  return x + 1; // I have more complex code here
}
int main() {
  int s = 0;
  for (int i = 0; i < 1000000; ++i) {
    s += foo(42);
  }
}

Without -O3, this code runs for a few minutes. With -O3, it produces the same result almost instantly. I believe Clang++ caches the value of foo(42) (it's a pure function) and doesn't call it a million times. How can I instruct it NOT to apply this particular optimization to this particular function call?


Solution

  • Out of curiosity, can you share why you would want to disable that optimization?

    Anyway, about your question:

    In your example code, s is never read after the loop, so the compiler is free to discard the whole loop as dead code. So let's assume that s is used after the loop.

    I'm not aware of any pragmas or compiler options to disable a particular optimization in a particular section of code.

    Is changing the code an option? To prevent that optimization in a portable manner, you can compute the function call argument in a way that stops the compiler from treating it as a constant. The challenge, of course, is to use a trick that does not rely on undefined behavior and that cannot be "outsmarted" by a newer compiler version.

    See the commented example below.

    • pro: the trick uses only standard language features, and you can apply it selectively
    • con: you get an additional memory access in every loop iteration; however, the access will be satisfied by your CPU cache most of the time

    I verified the generated assembly for your particular example with clang++ -O3 -S. The compiler now generates your loop and no longer caches the result. However, the function gets inlined. If you want to prevent that as well, you can declare foo with __attribute__((noinline)), for example.