
How cases get evaluated in switch statements (C)


I'm currently learning C, and I'm already familiar with basic programming concepts. I have a question about the switch statement. For example, in the following code:

for(int i =0 ; i<20; i++){

        switch(i){
        case 0: i+=5; /*label 1*/
        case 1: i+=2; /*label 2*/
        case 5: i+=5; /*label 3*/
        default : i+=4; /*label 4*/
        }
        printf("%d\t",i);
    }

The output is 16 21.

That means the case at label 1 is executed first and then, since there is no break, labels 2, 3 and 4 are executed as well. My question: once label 1 has run and i has been updated to 5, do the other cases check their condition first (whether i is 1 or 5) before executing, or do they just execute without checking?


Solution

  • It's a very good question, and actually reveals the internals of the switch statement in C and C++, which can sometimes be confused with cascading if-else statements.

    The switch statement in C/C++ works as follows:

    • (1) first it evaluates the expression presented as a condition in the switch statement
    • (2) stores the result on the stack or in a general-purpose register
    • (3) using that result it attempts to jump to the corresponding case statement with the minimum comparisons possible by using a jump-table (when one can be built).

    It is because of (1) and (2) that the switch you created is not behaving the way you may expect, and it doesn't reevaluate the initial expression during the execution of the case statements.

    In contrast with cascading if-else statements, your case statements are essentially blocks of instructions compiled in sequential order, referenced by a jump table as mentioned at (3). Once execution reaches a case statement, it automatically cascades into the following case statements unless a break is encountered. A break is compiled into a jump past the end of the switch statement, so none of the remaining case statements are executed.
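
    One way to picture this is to write the switch out by hand with goto. The sketch below is only an illustration, not what any particular compiler emits: the function name run_loop_with_gotos, the selector variable and the out buffer are made up for the example, and the values that the original code prints are stored in an array instead.

```c
#include <stddef.h>

/* Hypothetical helper: the loop from the question, with the switch
 * written out as explicit gotos. Each value the original code would
 * print is stored in out[]; the return value is how many were stored. */
size_t run_loop_with_gotos(int *out, size_t cap)
{
    size_t n = 0;

    for (int i = 0; i < 20; i++) {
        int selector = i;        /* the switch expression is read once */

        if (selector == 0) goto label_1;
        if (selector == 1) goto label_2;
        if (selector == 5) goto label_3;
        goto label_4;            /* default */

    label_1: i += 5;             /* no test of i from here on: */
    label_2: i += 2;             /* execution simply falls through */
    label_3: i += 5;
    label_4: i += 4;

        if (n < cap)
            out[n++] = i;
    }
    return n;
}
```

    Called with a buffer of at least two ints, this stores 16 and then 21, matching the output in the question. Changing i inside label_1 has no effect on which labels run afterwards, because the selection was already made.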

    Check out this commented disassembly of your switch statement, just to have a better grip of what's happening under the hood:

       0x56555585 <+56>:    mov    -0x10(%ebp),%eax       ;<--- load "i" (the switch condition) into EAX
       0x56555588 <+59>:    cmp    $0x1,%eax              ;<--- check "case 1"
       0x5655558b <+62>:    je     0x5655559a <main+77>   ;<--- jump if equal to "case 1"
       0x5655558d <+64>:    cmp    $0x5,%eax              ;<--- check "case 5"
       0x56555590 <+67>:    je     0x5655559e <main+81>   ;<--- jump if equal to "case 5" 
       0x56555592 <+69>:    test   %eax,%eax              ;<--- check "case 0"
       0x56555594 <+71>:    jne    0x565555a2 <main+85>   ;<--- if not equal, jump to "default"
       0x56555596 <+73>:    addl   $0x5,-0x10(%ebp)       ;<--- case 0
       0x5655559a <+77>:    addl   $0x2,-0x10(%ebp)       ;<--- case 1
       0x5655559e <+81>:    addl   $0x5,-0x10(%ebp)       ;<--- case 5
       0x565555a2 <+85>:    addl   $0x4,-0x10(%ebp)       ;<--- default
    

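    Note: this was built with the gcc options -m32 -O0, which produce 32-bit code (much easier to read) and disable optimizations. With only three case values, the compiler emitted a short chain of comparisons here rather than a jump table.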

    You can clearly see that after the jump is made (to any case statement) there is no further reevaluation of i (-0x10(%ebp)). Also, when the case is executed, it automatically cascades to the next one if no break is used.
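
    For comparison, here is a sketch of the same loop with a break after every case, so that each case runs on its own. The function name run_loop_with_breaks and the out buffer are hypothetical and exist only to make the result easy to inspect.

```c
#include <stddef.h>

/* Hypothetical helper: the loop from the question with a break after
 * every case. Each value that would be printed is stored in out[];
 * the return value is how many were stored. */
size_t run_loop_with_breaks(int *out, size_t cap)
{
    size_t n = 0;

    for (int i = 0; i < 20; i++) {
        switch (i) {
        case 0:  i += 5; break;  /* break jumps past the closing brace */
        case 1:  i += 2; break;
        case 5:  i += 5; break;
        default: i += 4; break;
        }
        if (n < cap)
            out[n++] = i;
    }
    return n;
}
```

    With the breaks in place the stored values are 5, 10, 15 and 20 instead of 16 and 21, because only the matching case runs on each pass through the loop.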

    Now, you may ask yourself why the switch behaves in this odd way. The answer is in (3): it lets the program jump to the corresponding case statement with the minimum number of comparisons possible.

    Switch statements in C/C++ show their true strength when the number of case statements scales up, and especially when the spread between the case values is constant.

    For example, let's assume we have a large switch statement with 100 case values and a constant spread of 1 between them, and that the switch expression (i) evaluates to 100 (the last case in the switch):

    switch (i) {
     case 1: /*code for case 1*/ break;
     case 2: /*code for case 2*/ break;
     [...]
     case 99: /*code for case 99*/ break;
     case 100: /*code for case 100*/ break;
    }
    

    If you used cascading if-else statements you would need up to 100 comparisons, but this switch can obtain the same result using just a couple of instructions, in this order:

    • first: at compile time, the compiler indexes all the case statements in a jump table
    • second: at run time, the program evaluates the condition in the switch and stores the result (i.e.: fetches i)
    • third: it calculates the corresponding index in the jump table from the result (i.e.: it subtracts the first case value, 1, from i, which gives index 99), after checking that the value falls inside the table
    • fourth: it jumps directly to the corresponding case without any further operation
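
    The steps above can be imitated by hand in C, scaled down to four cases. This is only a sketch under stated assumptions: a real jump table holds code addresses inside a single function, whereas this example uses an array of function pointers, and the names case_1 to case_4, jump_table and dispatch are invented for the illustration.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the bodies of "case 1" .. "case 4". */
static int case_1(void) { return 10; }
static int case_2(void) { return 20; }
static int case_3(void) { return 30; }
static int case_4(void) { return 40; }

typedef int (*case_fn)(void);

/* First: one table entry per case value, in ascending order. */
static const case_fn jump_table[] = { case_1, case_2, case_3, case_4 };

/* Returns the result of the matching case, or -1 when no case matches. */
int dispatch(int i)
{
    const unsigned count = sizeof jump_table / sizeof jump_table[0];

    /* Second and third: fetch i once and subtract the first case value.
     * Unsigned arithmetic makes values below 1 wrap around to a huge
     * index, so a single comparison rejects both ends of the range. */
    unsigned index = (unsigned)i - 1u;
    if (index >= count)
        return -1;

    /* Fourth: one indexed jump, with no further comparisons. */
    return jump_table[index]();
}
```

    Here dispatch(4) reaches case_4 through one subtraction, one range check and one indexed call, and the cost stays the same however many entries the table has.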

    The same will apply if your case values have a spread of 2:

    switch (i) {
     case 1: /*code for case 1*/ break;
     case 3: /*code for case 3*/ break;
     [...]
     case 99: /*code for case 99*/ break;
     case 101: /*code for case 101*/ break;
    }
    

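    Your compiler may detect this spread too and, after subtracting the first case value (which is 1), divide by 2 to obtain the index into the jump table. Alternatively, it may build a table that covers the whole range of values and point the unused entries past the end of the switch.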
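
    The index calculation for a spread of 2 can be sketched the same way, again scaled down to four cases (1, 3, 5 and 7). The names odd_case_1 to odd_case_7, stride_table and dispatch_stride2 are hypothetical, and whether a given compiler actually performs this transformation is an assumption that depends on the compiler and its optimization settings.

```c
#include <stddef.h>

/* Hypothetical stand-ins for the bodies of "case 1", "case 3", ... */
static int odd_case_1(void) { return 10; }
static int odd_case_3(void) { return 30; }
static int odd_case_5(void) { return 50; }
static int odd_case_7(void) { return 70; }

typedef int (*stride_fn)(void);

static const stride_fn stride_table[] = {
    odd_case_1, odd_case_3, odd_case_5, odd_case_7
};

/* Returns the result of the matching case, or -1 when no case matches. */
int dispatch_stride2(int i)
{
    const unsigned count = sizeof stride_table / sizeof stride_table[0];

    unsigned offset = (unsigned)i - 1u;  /* subtract the first case value */
    if (offset % 2u != 0)
        return -1;                       /* values between the cases */

    unsigned index = offset / 2u;        /* divide by the spread */
    if (index >= count)
        return -1;                       /* below 1 or above 7 */

    return stride_table[index]();
}
```

    For example, dispatch_stride2(7) computes (7 - 1) / 2 = 3 and calls the last entry, while dispatch_stride2(4) is rejected because 4 lies between two case values.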

    These complicated inner workings make the switch statement a very powerful tool in C/C++ when you want to branch your code based on a value you can only evaluate at run time, and when that value belongs to a set that is evenly spread or, at least, to groups of values with an even spread.

    When the case values don't have an even spread, the switch becomes less efficient and starts to perform much like the equivalent cascading if-else statements.