I am reading this book by Fedor Pikus and he has some very very interesting examples which for me were a surprise.
Particularly this benchmark caught me, where the only difference is that in one of them we use || in if and in another we use |.
void BM_misspredict(benchmark::State& state)
{
std::srand(1);
const unsigned int N = 10000;;
std::vector<unsigned long> v1(N), v2(N);
std::vector<int> c1(N), c2(N);
for (int i = 0; i < N; ++i)
{
v1[i] = rand();
v2[i] = rand();
c1[i] = rand() & 0x1;
c2[i] = !c1[i];
}
unsigned long* p1 = v1.data();
unsigned long* p2 = v2.data();
int* b1 = c1.data();
int* b2 = c2.data();
for (auto _ : state)
{
unsigned long a1 = 0, a2 = 0;
for (size_t i = 0; i < N; ++i)
{
if (b1[i] || b2[i]) // Only difference
{
a1 += p1[i];
}
else
{
a2 *= p2[i];
}
}
benchmark::DoNotOptimize(a1);
benchmark::DoNotOptimize(a2);
benchmark::ClobberMemory();
}
state.SetItemsProcessed(state.iterations());
}
void BM_predict(benchmark::State& state)
{
std::srand(1);
const unsigned int N = 10000;;
std::vector<unsigned long> v1(N), v2(N);
std::vector<int> c1(N), c2(N);
for (int i = 0; i < N; ++i)
{
v1[i] = rand();
v2[i] = rand();
c1[i] = rand() & 0x1;
c2[i] = !c1[i];
}
unsigned long* p1 = v1.data();
unsigned long* p2 = v2.data();
int* b1 = c1.data();
int* b2 = c2.data();
for (auto _ : state)
{
unsigned long a1 = 0, a2 = 0;
for (size_t i = 0; i < N; ++i)
{
if (b1[i] | b2[i]) // Only difference
{
a1 += p1[i];
}
else
{
a2 *= p2[i];
}
}
benchmark::DoNotOptimize(a1);
benchmark::DoNotOptimize(a2);
benchmark::ClobberMemory();
}
state.SetItemsProcessed(state.iterations());
}
I will not go in all the details explained in the book why the latter is faster, but the idea is that hardware branch predictor is given 2 chances to misspredict in slower version and in the | (bitwise or) version. See the benchmark results below.
So the question is why don't we always use | instead of || in branches?
Is
if(A | B)
always faster thanif(A || B)
?
No, if(A | B)
is not always faster than if(A || B)
.
Consider a case where A
is true and the B
expression is a very expensive operation. Not doing the operation can save that expense.
So the question is why don't we always use | instead of || in branches?
Besides the cases where the logical or is more efficient, the efficiency is not the only concern. There are often operations that have pre-conditions, and there are cases where the result of the left hand operation signals whether the pre-condition is satisfied for the right hand operation. In such case, we must use the logical operator.
if (b1[i]) // maybe this exists somewhere in the program
b2 = nullptr;
if(b1[i] || b2[i]) // OK
if(b1[i] | b2[i]) // NOT OK; indirection through null pointer
It is this possibility that is typically the problem when the optimiser is unable to replace logical with bitwise. In the example of if(b1[i] || b2[i])
, the optimiser can do such replacement only if it can prove that b2
is valid at least when b1[i] != 0
. That condition might not exist in your example, but that doesn't mean that it would necessarily be easy or - sometimes even possible - for the optimiser to prove that it doesn't exist.
Furthermore, there can be a dependency on the order of the operations, for example if one operand modifies a value read by the other operation:
if(a || (a = b)) // OK
if(a | (a = b)) // NOT OK; undefined behaviour
Also, there are types that are convertible to bool and thus are valid operands for ||
, but are not valid operators for |
:
if(ptr1 || ptr2) // OK
if(ptr1 | ptr2) // NOT OK; no bitwise or for pointers
TL;DR If we could always use bitwise or instead of logical operators, then there would be no need for logical operators and they probably wouldn't be in the language. But such replacement is not always a possibility, which is the reason why we use logical operators, and also the reason why optimiser sometimes cannot use the faster option.