Does the -O0 compiler flag have the same effect as the volatile keyword in C?


When you use the -O0 compiler flag in C, you tell the compiler to avoid any kind of optimization. When you declare a variable as volatile, you tell the compiler to avoid optimizing accesses to that variable. Can the two approaches be used interchangeably? And if so, what are the pros and cons? Below are some that I can think of. Are there any more?

Pros:

  • Using the -O0 flag is helpful if we have a big code base in which variables that should have been declared volatile are not. If the code shows buggy behavior, instead of going through the code to find which variables need to be declared volatile, we can just use the -O0 flag to rule out optimization as the cause of the problem.

Cons:

  • The -O0 flag affects the entire code base, while the volatile keyword affects only a specific variable. If we're working on a small microcontroller, for example, this could be a problem, since using -O0 may produce a big executable.

Solution

  • The short answer is that the volatile keyword does not mean "do not optimize". It means something completely different: it informs the compiler that the variable may be changed by something the compiler cannot see in the normal program flow. For example:

    1. It can be changed by hardware, usually a register mapped into the memory address space.
    2. It can be changed by a function that is never called from the normal program flow, for example an interrupt routine.
    3. It can be changed by another process or by other hardware, for example shared memory in multiprocessor / multicore systems.

    A volatile variable has to be read from its storage location every time it is used, and written back every time it is changed.
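
    The interrupt case can be sketched roughly as follows. This is only an illustration: data_ready, fake_isr and poll_until_ready are hypothetical names, and fake_isr stands in for a real interrupt handler that the hardware, not the program, would invoke.

```c
#include <stdbool.h>

/* Hypothetical flag shared between an interrupt handler and the main code.
   volatile forces a real load from memory on every read of the flag. */
static volatile bool data_ready = false;

/* Stand-in for an interrupt service routine. On real hardware the
   interrupt controller would invoke it, not the normal program flow. */
void fake_isr(void)
{
    data_ready = true;
}

/* Polls the flag at most max_polls times. Returns the index of the poll
   on which the flag was seen set, or -1 if it never was. */
int poll_until_ready(int max_polls)
{
    for (int i = 0; i < max_polls; i++) {
        if (data_ready) {
            return i;
        }
    }
    return -1;
}
```

    Without volatile, an optimizing compiler that sees no write to data_ready inside the loop is allowed to read it once and reuse that value, so the loop could miss a change made by the handler.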

    Here is an example that shows the effect on the generated code:

    int foo(volatile int z)
    {
        return z + z + z + z;
    }
    
    int foo1(int z)
    {
        return z + z + z + z;    
    }
    

    and the resulting code (compiled with -O0):

    foo(int):
      push rbp
      mov rbp, rsp
      mov DWORD PTR [rbp-4], edi
      mov edx, DWORD PTR [rbp-4]
      mov eax, DWORD PTR [rbp-4]
      add edx, eax
      mov eax, DWORD PTR [rbp-4]
      add edx, eax
      mov eax, DWORD PTR [rbp-4]
      add eax, edx
      pop rbp
      ret
    foo1(int):
      push rbp
      mov rbp, rsp
      mov DWORD PTR [rbp-4], edi
      mov eax, DWORD PTR [rbp-4]
      sal eax, 2
      pop rbp
      ret
    

    The difference is clear. Even at -O0, the volatile parameter is read from memory four times, while the non-volatile one is read once and then multiplied by 4 (the shift left by 2).

    You can experiment with it yourself here: https://godbolt.org/g/RiTU4g

    In most cases, if a program stops working when you turn on compiler optimization, the code contains hidden undefined behavior (UB). You should debug for as long as it takes to find all of it. A correctly written program must run at any optimization level.
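
    As one illustration of such hidden UB (an example chosen here, not taken from the question), consider a signed overflow check. The function name increment_would_overflow is hypothetical; the broken variant is kept in a comment because executing it would itself be undefined.

```c
#include <limits.h>

/* Broken check, shown only as a comment:
       return x + 1 < x;
   Signed integer overflow is undefined behavior, so an optimizing
   compiler may assume x + 1 never wraps and fold the test to 0.
   Code like this can appear to work at -O0 and fail at -O2. */

/* Well-defined version: compare against the limit before adding.
   Returns 1 if x + 1 would overflow an int, 0 otherwise. */
int increment_would_overflow(int x)
{
    return x == INT_MAX;
}
```

    The fix here is to remove the undefined behavior, not to lower the optimization level or add volatile.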

    Bear in mind that volatile does not guarantee coherency or atomicity.
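
    A minimal sketch of the atomicity point, assuming a C11 compiler that provides <stdatomic.h>. The counter and function names are hypothetical. Incrementing a volatile int is still a separate load, add and store, so an interrupt or another core can interleave between those steps; an atomic type makes the increment indivisible.

```c
#include <stdatomic.h>

/* Every access goes to memory, but ++ is still load, add, store. */
static volatile int volatile_hits = 0;

/* C11 atomic counter: the increment is one indivisible operation. */
static atomic_int atomic_hits = 0;

/* Not safe if an interrupt handler or another thread also updates it. */
void record_hit_volatile(void)
{
    volatile_hits++;
}

/* Safe against concurrent updates of the same counter. */
void record_hit_atomic(void)
{
    atomic_fetch_add(&atomic_hits, 1);
}

int read_volatile_hits(void)
{
    return volatile_hits;
}

int read_atomic_hits(void)
{
    return atomic_load(&atomic_hits);
}
```

    In single-threaded use both counters behave the same; the difference only shows up when updates can overlap.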