How does Clang produce this eBPF bytecode?


I'm trying to understand eBPF using some simple programs. I've got this program that I've compiled with clang -target bpf -Wall -O2 -c bpf.c -o bpf.o:

long loopy(long x)
{
    for (; x > 0; x += 3)
        ;
    return x;
}

and the assembly (llvm-objdump -d bpf.o) is:

bpf.o:  file format elf64-bpf

Disassembly of section .text:

0000000000000000 <loopy>:
       0:       bf 10 00 00 00 00 00 00 r0 = r1
       1:       07 00 00 00 fd ff ff ff r0 += -0x3

0000000000000010 <LBB0_1>:
       2:       07 00 00 00 03 00 00 00 r0 += 0x3
       3:       65 01 fe ff 00 00 00 00 if r1 s> 0x0 goto -0x2 <LBB0_1>
       4:       95 00 00 00 00 00 00 00 exit

What I don't understand is output line 3: why is r1 tested when r0 is the variable that is being changed? This line would make sense to me if the test were r0 s> 0x0. As is, I expect that the loop would never terminate. Am I missing something here, or is this bad output from Clang?

As an aside, I realize this program would not pass the Linux BPF verifier, but I still expect accurate bytecode from Clang.


Solution

  • TL;DR. The generated bytecode is correct: your C code never terminates either when x is strictly positive. Good thing the verifier exists :D


    Let's consider the two cases for the function argument's initial value: x is negative or zero; x is positive.

    x <= 0

    The C code: It will never enter the loop and simply return the value of x as it was.

    The eBPF bytecode: Instructions 0, 1 and 2 copy x into r0, subtract 3, then add 3 back. So before instruction 3, r0 holds x's initial negative or zero value, and r1 has not been modified at all. Since r1 is not greater than 0, the jump is not taken and execution reaches the exit at instruction 4. The bytecode therefore returns r0, which is x's initial value, same as the C code.
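To check that walkthrough by hand, here is a minimal C sketch that mirrors the five instructions from the disassembly one by one. The function name run_loopy_bytecode and the max_branches budget are hypothetical, added only so the model can be run safely on any input; they are not part of the eBPF program or of any eBPF tooling.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model of the disassembled loopy(): one C statement per
 * eBPF instruction. max_branches is an assumption of this sketch (the
 * real bytecode has no such limit); it caps how often the backward jump
 * may be taken so the model always returns.
 *
 * Returns true and stores r0 in *result if "exit" is reached, false if
 * the branch budget runs out first.
 */
bool run_loopy_bytecode(int64_t x, unsigned max_branches, int64_t *result)
{
    int64_t r1 = x;           /* argument register, never written below */
    uint64_t r0;              /* unsigned so wraparound is defined in C */

    r0 = (uint64_t)r1;        /* 0: r0 = r1 */
    r0 += (uint64_t)-3;       /* 1: r0 += -0x3 */
    for (;;) {
        r0 += 3;              /* 2: r0 += 0x3 (label LBB0_1) */
        if (!(r1 > 0))        /* 3: if r1 s> 0x0 goto LBB0_1 */
            break;
        if (max_branches-- == 0)
            return false;     /* budget exhausted, still looping */
    }
    /* 4: exit. Converting back assumes the usual two's complement
     * behavior that mainstream compilers provide. */
    *result = (int64_t)r0;
    return true;
}
```

With this sketch, run_loopy_bytecode(-7, 10, &out) returns true and leaves -7 in out, matching the C function for non-positive inputs.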

    x > 0

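    The C code: It will enter the loop. Since x only increases from that point on, the condition x > 0 could only become false through signed overflow, and signed overflow is undefined behavior in C. The compiler is therefore allowed to assume it never happens, which means the loop never exits.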

    The eBPF bytecode: As you've noticed, the eBPF bytecode will also enter the loop and never exit, because instruction 3 tests r1, which still holds the initial positive value of x and is never modified inside the loop. The end result is the same as for the C code: your function will not return.
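The undefined-behavior point is what makes testing the unchanged r1 legitimate. As a contrast, here is a sketch of the same loop with wrapping arithmetic spelled out explicitly through uint64_t. The name loopy_wrapping and the iterations counter are hypothetical and exist only for this illustration. Under these wrapping semantics the loop does exit once the value passes INT64_MAX, so a compiler given this code has to test the incremented value rather than the original argument.

```c
#include <stdint.h>

/*
 * Hypothetical variant of loopy() with explicit wraparound. The addition
 * is done on uint64_t, where overflow is well defined, and the result is
 * reinterpreted as signed for the comparison (this conversion assumes the
 * usual two's complement behavior of mainstream compilers).
 * *iterations receives the number of times the loop body ran.
 */
int64_t loopy_wrapping(int64_t x, uint64_t *iterations)
{
    uint64_t bits = (uint64_t)x;

    *iterations = 0;
    while ((int64_t)bits > 0) {
        bits += 3;            /* wraps past INT64_MAX into negative values */
        ++*iterations;
    }
    return (int64_t)bits;
}
```

Calling loopy_wrapping(INT64_MAX - 1, &n) returns INT64_MIN + 1 with n == 1. Starting from a small positive x the loop still terminates, but only after roughly 2^63 / 3 iterations, so it is not practical to run.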