Hi I am on my way to optimize a function that is supposed to give me the "next" of something. So far what I have got is
int fun(int a){
const int k = ...;
for(;test_value(a++) != k;);
return a;
}
This was a quick and dirty way to test that my algorithm actually worked but now afterwards I am worried that the loop makes a test for branching on every iteration (if not the compiler is very good at handling it behind the scenes?). Let us say the chance that any a fulfils the test is at most 1/5 and worst cases one in a million but that test_value is just a clock cycle or two. Is there some systematic way I can help my compiler trade all the branches with arithmetics to better utilize the CPU pipelines?
You could "unroll" your loop a bit, something like:
int nomatch = 1;
while( nomatch ){
nomatch = (test_value(a++) != k);
nomatch &&= (test_value(a++) != k);
nomatch &&= (test_value(a++) != k);
nomatch &&= (test_value(a++) != k);
nomatch &&= (test_value(a++) != k);
}
This would yield fewer iterations, and short-circuiting would prevent the evaluations of test_value
once a match has been found.
Like your original code, this assumes that a match will be found at some point.