On my gcc-4.8.1, I've compiled the following program with two commands:
g++ -Wfatal-errors -std=c++11 -Wall -Werror test.cpp -o test -g
g++ -Wfatal-errors -std=c++11 -Wall -Werror test.cpp -o test -O3 -g
The first executable produces the expected output, but the second one segfaults. This is hard to debug because -O3 rearranges the code too much for the -g debug information to stay meaningful, so gdb has trouble mapping what is going on back to the source code. So I started inserting print statements instead. As I expected, the print statements change the result: with debug prints, it works just fine!
Here is my expression template source:
//test.cpp
#include <vector>
#include <stdlib.h>
#include <iostream>
using namespace std;

typedef vector<int> Valarray;

template<typename L, typename R>
struct BinOpPlus {
    const L& left;
    const R& right;

    BinOpPlus(const L& l, const R& r)
        : left(l), right(r)
    {}

    int operator[](int i) const {
        int l = left[i];
        //cerr << "Left: " << l << endl; //uncomment to fix segfault
        int r = right[i];
        //cerr << "Right: " << r << endl; //uncomment to fix segfault
        return l + r;
    }
};

template<typename L, typename R>
BinOpPlus<L, R> operator+(const L& left, const R& right){
    return BinOpPlus<L, R>(left, right);
}

int main() {
    //int size = 10000000;
    int size = 10;
    Valarray v[3];
    for(int n=0; n<3; ++n){
        for(int i=0; i<size; ++i){
            int val = rand() % 100;
            v[n].push_back(val);
        }
    }
    auto out = v[0] + v[1] + v[2];
    int sum = 0;
    for(int i=0; i<size; ++i){
        cerr << "Checkpoint!" << endl;
        sum += out[i]; //segfaults here
        cerr << "Sum: " << sum << endl;
    }
    cout << "Sum: " << sum << endl;
    return 0;
}
It's been a long time since -O3 has given me an incorrect or unreliable binary. For now I am assuming that I did something wrong in my code, but not wrong enough for -O0 to show it. Does anyone have any idea what I'm doing wrong?
In this line
auto out = v[0] + v[1] + v[2];
the type of out is BinOpPlus<BinOpPlus<Valarray, Valarray>, Valarray>. Since your BinOpPlus stores references to its arguments, and the inner BinOpPlus<Valarray, Valarray> is a temporary that is destroyed at the end of that statement, out is left holding a dangling reference, and using it is undefined behavior.
Usually expression templates like these use a trait to decide how to store their arguments, so that you can store actual objects by reference (and assume that the user will not mess up) and other ETs by value (they are very small anyway).
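As a rough illustration of the trait idea, here is a minimal sketch, assuming C++11. The names Storage and sum_of_three are made up for this example and are not from the question. I also switched the index type to std::size_t. Leaves such as Valarray are held by const reference, while nested BinOpPlus nodes are copied into their parent.

```cpp
#include <cstddef>
#include <vector>

typedef std::vector<int> Valarray;

template <typename L, typename R>
struct BinOpPlus;

// Hypothetical trait (not from the question): by default an operand is
// treated as a leaf and held by const reference.
template <typename T>
struct Storage {
    typedef const T& type;
};

// Nested expression nodes are usually temporaries, so hold a copy instead.
template <typename L, typename R>
struct Storage<BinOpPlus<L, R>> {
    typedef BinOpPlus<L, R> type;
};

template <typename L, typename R>
struct BinOpPlus {
    typename Storage<L>::type left;   // reference for leaves, copy for nodes
    typename Storage<R>::type right;

    BinOpPlus(const L& l, const R& r) : left(l), right(r) {}

    int operator[](std::size_t i) const { return left[i] + right[i]; }
};

template <typename L, typename R>
BinOpPlus<L, R> operator+(const L& left, const R& right) {
    return BinOpPlus<L, R>(left, right);
}

// Hypothetical helper: sums the first n elements of a + b + c.
inline int sum_of_three(const Valarray& a, const Valarray& b,
                        const Valarray& c, std::size_t n) {
    auto out = a + b + c;  // the inner (a + b) node is copied into out
    int sum = 0;
    for (std::size_t i = 0; i < n; ++i) {
        sum += out[i];
    }
    return sum;
}
```

With this layout, out no longer refers to a destroyed temporary. It still refers to a, b and c themselves, so those must outlive it (this is the "assume that the user will not mess up" part).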
Also, using auto with arithmetic ETs is considered problematic at the very least, because it rarely produces the intended type. For this very reason, there have been a couple of proposals to introduce a sort of operator auto to customize the type deduced by auto in ETs.
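One way to sidestep auto, sketched here under the assumption that the intended result type is Valarray, is to evaluate the expression into a concrete container inside the same full-expression that builds it. The helpers evaluate and add_three are hypothetical names for this example. BinOpPlus is the reference-storing version from the question, except that it indexes with std::size_t.

```cpp
#include <cstddef>
#include <vector>

typedef std::vector<int> Valarray;

// Same reference-storing node as in the question.
template <typename L, typename R>
struct BinOpPlus {
    const L& left;
    const R& right;

    BinOpPlus(const L& l, const R& r) : left(l), right(r) {}

    int operator[](std::size_t i) const { return left[i] + right[i]; }
};

template <typename L, typename R>
BinOpPlus<L, R> operator+(const L& left, const R& right) {
    return BinOpPlus<L, R>(left, right);
}

// Hypothetical helper: copies the first n elements of an expression
// into a concrete Valarray.
template <typename Expr>
Valarray evaluate(const Expr& expr, std::size_t n) {
    Valarray result(n);
    for (std::size_t i = 0; i < n; ++i) {
        result[i] = expr[i];
    }
    return result;
}

// Hypothetical helper: element-wise sum of three arrays of equal size.
inline Valarray add_three(const Valarray& a, const Valarray& b,
                          const Valarray& c) {
    // The temporaries created by a + b + c live until the end of this
    // full-expression, so evaluate() reads them while they are still valid.
    return evaluate(a + b + c, a.size());
}
```

The expression object never outlives the statement that created it, so the stored references stay valid for as long as they are used.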