[I have come back to this question many years later and tried to edit it to be clearer and hopefully helpful to others. I was a clueless teen in high school when I originally asked it.]
I was testing the performance of a series of if statements vs. an else-if chain and was very surprised by the results.
The code iterates over a big array (100,000,000 elements) and counts occurrences of the numbers 1-5.
// init the array with some numbers
// init the counters
clock_t time_if = clock();
for (int i = 0; i < n; i++) {
    if (arr[i] == 1)
        p1++;
    if (arr[i] == 2)
        p2++;
    // ... same for 3, 4, 5
}
time_if = clock() - time_if;
printf("count of 1s:\t %d\n", p1);
// ...
printf("--- counting took: %.10f ms ---\n", ((double)(time_if) / CLOCKS_PER_SEC) * 1000);
and then the same but with else-if
clock_t time_if_else = clock();
for (int i = 0; i < n; i++) {
    if (arr[i] == 1)
        p1++;
    else if (arr[i] == 2)
        p2++;
    else if (arr[i] == 3)
        p3++;
    // ... same for 4, 5
}
time_if_else = clock() - time_if_else;
As expected, with an array of random values from 1 to 5 the else-if version is faster (about two times), because once one branch of the chain matches (its condition is true), the remaining branches are skipped. A plain series of ifs, on the other hand, has no chaining, so every condition is evaluated for every element.
BUT when I passed in an array of only 1s, the if series was faster than it had been with random numbers. The same was true when I passed in only 5s.
So I asked: can anyone explain why the ifs are faster with only 1s than with random numbers?
The accepted answer points to branch prediction in modern CPUs and to compiler optimization. I had not heard of either at the time, so this was a valuable lesson for me. I hope anyone else stumbling upon this learns something too.
Here are the times:
if - random input: 2451 ms
if-else - random input: 2401 ms
if - all 1s input: 932 ms
if-else - all 1s input: 573 ms
if - all 5s input: 923 ms
if-else - all 5s input: 697 ms
Modern CPUs are very complicated beasts.
I guess you are benchmarking the CPU's branch predictor more than anything else: if you fill the array with random numbers from the range 1...5, the branch predictors will quite often guess wrong when any of the ifs is executed.
If you fill the array with a constant, the predictors will be right 100% of the time.
I.e. with randomized input, the number of evaluated ifs really matters, as a substantial number of them cause a CPU pipeline stall.
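As a side note (not something from the question or answer): one way to make the cost input-independent is branchless counting. In C a comparison evaluates to 0 or 1, so you can add it straight into the counter and give the predictor nothing to mispredict:

```c
#include <stddef.h>

/* Branchless tally: each comparison yields 0 or 1, which is added
 * directly to the counter, so the loop body contains no conditional
 * jumps for the predictor to miss. (Illustrative sketch only.) */
void count_branchless(const int *arr, size_t n, int counts[6]) {
    for (size_t i = 0; i < n; i++) {
        counts[1] += (arr[i] == 1);
        counts[2] += (arr[i] == 2);
        counts[3] += (arr[i] == 3);
        counts[4] += (arr[i] == 4);
        counts[5] += (arr[i] == 5);
    }
}
```

Compilers typically lower such comparisons to flag-setting instructions rather than jumps, so the runtime comes out roughly the same for random and constant input.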
With constant input, the number of executed ifs is more or less negligible and there are no pipeline stalls. All that matters then is CPU internals and how well your compiler is able to optimize your code. A smart compiler might effectively replace one or both of your examples with a switch statement, so without looking at the generated instructions we can only speculate.
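For illustration, the switch shape speculated about above might look like this hand-written sketch; whether the compiler actually produces something equivalent can only be verified in the generated assembly:

```c
#include <stddef.h>

/* A switch version of the same tally. A compiler may lower a dense
 * switch like this to a jump table; conversely, it may transform the
 * if/else-if chains into this form. (A sketch, not compiler output.) */
void count_switch(const int *arr, size_t n, int counts[6]) {
    for (size_t i = 0; i < n; i++) {
        switch (arr[i]) {
        case 1: counts[1]++; break;
        case 2: counts[2]++; break;
        case 3: counts[3]++; break;
        case 4: counts[4]++; break;
        case 5: counts[5]++; break;
        }
    }
}
```

And since the values are known to lie in 1..5, the whole loop body could even collapse to `counts[arr[i]]++`, which has no data-dependent branch at all.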