Tags: c++, compiler-optimization, critical-section

-O3 loop increment optimization


I have this piece of code:

#include <iostream>
#include <thread>

long int global_variable;

struct process{
    long int loop_times_ = 0;
    bool op_;
    process(long int loop_times, bool op): loop_times_(loop_times), op_(op){}

    void run(){
        for(long int i=0; i<loop_times_; i++)
            if (op_) global_variable+=1;
            else global_variable-=1;
    }

};

int main(){
    struct process p1(10000000, true);
    struct process p2(10000000, false);

    std::thread t1(&process::run, p1);
    std::thread t2(&process::run, p2);
    t1.join();
    t2.join();

    std::cout <<global_variable<< std::endl;
    return 0;
}

The main function starts two threads, one incrementing and one decrementing a global variable. If I compile with this:

 g++ -std=c++11 -o main main.cpp -lpthread

I get a different output on each execution. But if I add -O3 and compile with this:

g++ -O3 -std=c++11 -o main main.cpp -lpthread

the output is zero every time.

What kind of optimization is happening here that eliminates my critical section, and how can I trick the compiler into not applying it?

EDIT: OS: Ubuntu 16.04.4, g++: 5.4.0


Solution

  • It's very likely that your run method is being optimized to the equivalent of:

    void run(){
        if (op_) global_variable += loop_times_;
        else global_variable -= loop_times_;
    }

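    This is something the compiler can do quite easily with the information available: global_variable is a plain (non-atomic, non-volatile) long int, so the compiler may assume that no other thread modifies it during the loop and fold all the iterations into a single update. Each thread then touches the variable only once, which leaves almost no window for the two threads to interleave, and that is why the output is zero every time.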
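To make the equivalence concrete, here is a minimal single-threaded sketch. The names run_loop and run_folded are hypothetical and not part of the original code, and a local value stands in for the global. It only illustrates that the per-iteration loop and the folded form produce the same result, which is what allows the substitution.

```cpp
// Per-iteration form, mirroring the body of run() in the question.
long int run_loop(long int start, long int loop_times, bool op) {
    long int value = start;
    for (long int i = 0; i < loop_times; i++) {
        if (op) value += 1;
        else    value -= 1;
    }
    return value;
}

// Folded form an optimizer may substitute: a single update in total.
long int run_folded(long int start, long int loop_times, bool op) {
    if (loop_times <= 0) return start;  // the loop body would never run
    return op ? start + loop_times : start - loop_times;
}
```

Calling both with the same arguments, for example run_loop(0, 10000000, true) and run_folded(0, 10000000, true), yields the same value, but the folded version reads and writes the variable once instead of ten million times.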

    To trick the compiler, you have to make sure that it's not obvious that the loop will add or subtract 1 with no other side effects on every iteration.

    Try adding a function call inside the loop that just increments a simple counter on the object, called _iterationsDone or some such. This might force the compiler into actually executing the loop. Passing your loop variable in as an argument might also force it to keep track of the intermediate values of i.

    struct process{
        long int loop_times_ = 0;
        bool op_;
        long int _iterationsDone = 0;
        process(long int loop_times, bool op): loop_times_(loop_times), op_(op){}
    
        void run(){
            for(long int i=0; i<loop_times_; i++){
                if (op_) global_variable+=1;
                else global_variable-=1;
                Trick(i);
            }
        }
    
        // Helper called on every iteration; takes long int to match the loop variable.
        void Trick(long int i){
            _iterationsDone += 1;
        }
    };
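
The helper-call approach is not guaranteed to work: if the compiler inlines the helper, it may fold both counters just as before. As an alternative sketch, assuming the goal is only to keep a real load and store on every iteration, the global can be declared volatile. The volatile qualifier and the run_both helper below are additions for illustration and are not in the original code. Concurrent non-atomic access is still formally a data race, so this is a way to observe the lost updates, not a fix for them.

```cpp
#include <thread>

// Assumption: volatile added to the question's global so that every
// read and write in the loop must actually be performed.
volatile long int global_variable = 0;

struct process {
    long int loop_times_ = 0;
    bool op_;
    process(long int loop_times, bool op) : loop_times_(loop_times), op_(op) {}

    void run() {
        for (long int i = 0; i < loop_times_; i++) {
            // Explicit read followed by write; the compiler cannot merge
            // these accesses across iterations because they are volatile.
            if (op_) global_variable = global_variable + 1;
            else     global_variable = global_variable - 1;
        }
    }
};

// Hypothetical helper: runs one incrementing and one decrementing thread
// and returns the final value of the global.
long int run_both(long int n) {
    global_variable = 0;
    process p1(n, true);
    process p2(n, false);
    std::thread t1(&process::run, p1);
    std::thread t2(&process::run, p2);
    t1.join();
    t2.join();
    return global_variable;
}
```

With a large n, run_both(n) would be expected to return varying values from run to run even at -O3, since the threads now interleave on every iteration. The exact values depend on scheduling and hardware.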