My code is as follows and I use GCC 4.8.2:
#include <iostream>
#include <stdint.h>
#include <sys/time.h>
#include <ctime>
using namespace std;

int main(int argc, char *argv[]) {
    struct timespec time_start={0, 0},time_end={0, 0};
    uint8_t bitmap[20240];
    int cost;

    clock_gettime(CLOCK_REALTIME, &time_start);
    for (int i = 0; i < 20240; ++i) {
        bitmap[i >> 3] |= 1 << (i&7);
    }
    clock_gettime(CLOCK_REALTIME, &time_end);
    cost = time_end.tv_nsec - time_start.tv_nsec;
    cout << "case COST: " << cost << endl;

    clock_gettime(CLOCK_REALTIME, &time_start);
    for (int i = 0; i < 20240; ++i) {
        bitmap[i >> 3] &= 1 << (i&7);
    }
    clock_gettime(CLOCK_REALTIME, &time_end);
    cost = time_end.tv_nsec - time_start.tv_nsec;
    cout << "case COST: " << cost << endl;

    int a = bitmap[1];
    std::cout << "TEST: " << a << endl;
}
I compile it with
gcc -lstdc++ -std=c++11 -O2 -ftree-vectorize -ftree-vectorizer-verbose=7 -fopt-info test.cpp
and get the note
test.cpp:14: note: not vectorized: not enough data-refs in basic block.
When I run the resulting binary a.out, COST is more than 20000. If I delete the line std::cout << "TEST: " << a << endl; the code is vectorized and COST is less than 100.
Can anyone explain why?
The statement
std::cout << "TEST: " << a << endl
sits outside both timed regions, so its own cost (formatting the output, plus the flush that std::endl performs and a plain \n does not) is never included in COST. What matters is that it is the only statement that reads anything from bitmap.
Once that statement is deleted, the statement before it,
int a = bitmap[1];
is also optimized away (removed) by the compiler, because the value of a is no longer used anywhere.
Moreover, both for loops are optimized away, because the bitmap values they compute are never read once the last cout statement is gone. The bitmap array itself is then unnecessary as well. The COST of less than 100 is therefore most likely just the time between two back-to-back clock_gettime calls, not the time of a vectorized loop.
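To make the point concrete, here is a minimal sketch of how the two loops can be kept alive in a measurement: their result has to be observable. The names kBits, set_all_bits, mask_all_bits and checksum are hypothetical and do not come from the question; the loop bodies are the same as in the original code.

```cpp
#include <stdint.h>

// Number of bits touched by each loop in the question.
const int kBits = 20240;

// Same work as the first loop in the question: set every bit.
void set_all_bits(uint8_t* bitmap) {
    for (int i = 0; i < kBits; ++i) {
        bitmap[i >> 3] |= static_cast<uint8_t>(1 << (i & 7));
    }
}

// Same work as the second loop: AND each byte with each single-bit mask.
void mask_all_bits(uint8_t* bitmap) {
    for (int i = 0; i < kBits; ++i) {
        bitmap[i >> 3] &= static_cast<uint8_t>(1 << (i & 7));
    }
}

// Hypothetical helper: fold every touched byte into one value.
// If the caller prints or otherwise uses the result, all bytes are
// observable and the stores in the loops above are no longer dead.
unsigned checksum(const uint8_t* bitmap) {
    unsigned sum = 0;
    for (int i = 0; i < kBits / 8; ++i) {
        sum += bitmap[i];
    }
    return sum;
}
```

Calling checksum(bitmap) after each timed loop and printing the value should keep both loops in the binary whether or not the final TEST line is present. An optimizer is still free to simplify the loops, so checking the generated assembly remains the reliable way to see what was actually measured.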
You can see the assembly that is generated for your code, with the compiler version and options you have given, here. You can also see clearly what happens when you comment out and uncomment the cout statement.
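When repeating the measurement, note that the code in the question subtracts only the tv_nsec fields, which gives a wrong (possibly negative) value whenever the two timestamps fall in different seconds. A small sketch of a safer difference, assuming a hypothetical helper named elapsed_ns that is not part of the original code:

```cpp
#include <time.h>

// Hypothetical helper: elapsed nanoseconds between two timespec values.
// Uses both tv_sec and tv_nsec, so it stays correct when the interval
// crosses a second boundary.
long long elapsed_ns(const timespec& start, const timespec& end) {
    return (end.tv_sec - start.tv_sec) * 1000000000LL
         + (end.tv_nsec - start.tv_nsec);
}
```

With this, cost would be computed as elapsed_ns(time_start, time_end) and stored in a long long rather than an int. For interval timing, CLOCK_MONOTONIC may also be a better fit than CLOCK_REALTIME, since the latter can jump if the system clock is adjusted.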