I am programming C on cygwin windows. After having done a bit of C programming and getting comfortable with the language, I wanted to look under the hood and see what the compiler is doing for the code that I write.
So I wrote down a code block containing switch case statements and converted them into assembly using:
gcc -S foo.c
Here is the C source:
switch(i)
{
case 1:
{
printf("Case 1\n");
break;
}
case 2:
{ printf("Case 2\n");
break;
}
case 3:
{
printf("Case 3\n");
break;
}
case 4:
{
printf("Case 4\n");
break;
}
case 5:
{
printf("Case 5\n");
break;
}
case 6:
{
printf("Case 6\n");
break;
}
case 7:
{
printf("Case 7\n");
break;
}
case 8:
{
printf("Case 8\n");
break;
}
case 9:
{
printf("Case 9\n");
break;
}
case 10:
{
printf("Case 10\n");
break;
}
default:
{
printf("Nothing\n");
break;
}
}
Now the resultant assembly for the same is:
movl $5, -4(%ebp)
cmpl $10, -4(%ebp)
ja L13
movl -4(%ebp), %eax
sall $2, %eax
movl L14(%eax), %eax
jmp *%eax
.section .rdata,"dr"
.align 4
L14:
.long L13
.long L3
.long L4
.long L5
.long L6
.long L7
.long L8
.long L9
.long L10
.long L11
.long L12
.text
L3:
movl $LC0, (%esp)
call _printf
jmp L2
L4:
movl $LC1, (%esp)
call _printf
jmp L2
L5:
movl $LC2, (%esp)
call _printf
jmp L2
L6:
movl $LC3, (%esp)
call _printf
jmp L2
L7:
movl $LC4, (%esp)
call _printf
jmp L2
L8:
movl $LC5, (%esp)
call _printf
jmp L2
L9:
movl $LC6, (%esp)
call _printf
jmp L2
L10:
movl $LC7, (%esp)
call _printf
jmp L2
L11:
movl $LC8, (%esp)
call _printf
jmp L2
L12:
movl $LC9, (%esp)
call _printf
jmp L2
L13:
movl $LC10, (%esp)
call _printf
L2:
Now, in the assembly, the code is first checking the last case (i.e. case 10) first. This is very strange. And then it is copying 'i' into 'eax' and doing things that are beyond me.
I have heard that the compiler implements some jump table for switch..case. Is it what this code is doing? Or what is it doing and why? Because in case of less number of cases, the code is pretty similar to that generated for if...else ladder, but when number of cases increases, this unusual-looking implementation is seen.
Thanks in advance.
First the code is comparing the i to 10 and jumping to the default case when the value is greater then 10 (cmpl $10, -4(%ebp)
followed by ja L13
).
The next bit of code is shifting the input to the left by two (sall $2, %eax
) which is the same as multiple by four which generates an offset into the jump table (because each entry in the table is 4 bytes long)
It then loads an address from the jump table (movl L14(%eax), %eax
) and jumps to it (jmp *%eax
).
The jump table is simply a list of addresses (represented in assembly code by labels):
L14:
.long L13
.long L3
.long L4
...
One thing to notice is that L13
represents the default case. It is both the first entry in the jump table (for when i is 0) and is handled specially at the beginning (when i > 10).