ebpf: how to use BPF_FUNC_trace_printk in eBPF assembly program


I have a small socket filter type eBPF program, where I'm trying to print a protocol value read from __sk_buff context:

struct bpf_insn prog[] = {
   BPF_MOV64_REG(BPF_REG_6, BPF_REG_1),
   BPF_LDX_MEM(BPF_W, BPF_REG_0, BPF_REG_6, offsetof(struct __sk_buff, protocol)),
   BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_0, -4),
   BPF_MOV64_REG(BPF_REG_1, BPF_REG_10),
   BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, -4),
   BPF_MOV64_IMM(BPF_REG_2, 4),
   BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_trace_printk),
   BPF_MOV64_IMM(BPF_REG_0, 0),
   BPF_EXIT_INSN(),
};

...

I create a raw socket and bind it to the lo interface, then call setsockopt(fd, SOL_SOCKET, SO_ATTACH_BPF, ...). It compiles and loads with no problems. However, whenever I ping 127.0.0.1, I never see traces in trace_pipe.

So, to make sure that BPF_FUNC_trace_printk actually works, I changed the program so that it prints a static string stored on the stack, and that does print on every packet hitting the loopback.

What am I doing wrong?


Solution

  • Read the friendly manual :)

    I don't believe you are calling the bpf_trace_printk() helper correctly (BPF_FUNC_trace_printk is just an integer, the helper's ID, by the way). Its signature, documented in the kernel UAPI header bpf.h and in the bpf-helpers man page, is as follows:

    long bpf_trace_printk(const char *fmt, u32 fmt_size, ...);
    

    This means that the first argument must point to a constant, null-terminated format string, not to an integer as in your program. The value to print goes in the third argument.

    What does clang do?

    I understand you are attaching your eBPF programs to sockets and cannot compile the whole program from C. However, why not compile that specific part as a generic networking eBPF program to see what the bytecode should look like? Let's write the C code:

    #include <linux/bpf.h>
    
    static long (*bpf_trace_printk)(const char *fmt, __u32 fmt_size, ...) = (void *) BPF_FUNC_trace_printk;
    
    int printk_proto(struct __sk_buff *skb) {
        char fmt[] = "%d\n";
    
        bpf_trace_printk(fmt, sizeof(fmt), skb->protocol);
    
        return 0;
    }
    

    Compile to an object file. For the record, this would not load unless we provided both a valid licence string (because bpf_trace_printk() needs a GPL-compatible program) and a compatible program type at load time. But that does not matter in our case; we just want to look at the generated instructions.

    $ clang -O2 -g -emit-llvm -c prink_protocol.c  -o - | \
            llc -march=bpf -mcpu=probe -filetype=obj -o prink_protocol.o 
    

    Dump the bytecode:

    $ llvm-objdump -d prink_protocol.o 
    
    prink_protocol.o:       file format elf64-bpf
    
    
    Disassembly of section .text:
    
    0000000000000000 <printk_proto>:
           0:       b4 02 00 00 25 64 0a 00 w2 = 680997
           1:       63 2a fc ff 00 00 00 00 *(u32 *)(r10 - 4) = r2
           2:       61 13 10 00 00 00 00 00 r3 = *(u32 *)(r1 + 16)
           3:       bf a1 00 00 00 00 00 00 r1 = r10
           4:       07 01 00 00 fc ff ff ff r1 += -4
           5:       b4 02 00 00 04 00 00 00 w2 = 4
           6:       85 00 00 00 06 00 00 00 call 6
           7:       b4 00 00 00 00 00 00 00 w0 = 0
           8:       95 00 00 00 00 00 00 00 exit
    

    We can see that in the first two instructions, the program writes the format string (in little endian) onto the stack: 680997 is 0x000a6425, which lands in memory as the bytes '%', 'd', '\n', '\0'. r2 is then set to the length of the format string (4 bytes, including the terminating null), as in your program. The protocol value is loaded into r3, the third argument for the call to bpf_trace_printk().
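Translated back into the instruction-array style from the question, the disassembly above could look something like the sketch below. The BPF_* instruction macros are not part of the UAPI headers, so they are redefined locally here, modelled on the ones the question appears to use; the names BPF_INSN, BPF_CALL_HELPER, printk_proto_prog and printk_proto_len are made up for this example. It also uses a 64-bit move where clang emitted a 32-bit one, which should not matter for these small immediates.

```c
#include <assert.h>
#include <stddef.h>
#include <linux/bpf.h>

/* Local instruction builders (assumed helpers, not provided by linux/bpf.h). */
#define BPF_INSN(CODE, DST, SRC, OFF, IMM) \
    { .code = (CODE), .dst_reg = (DST), .src_reg = (SRC), .off = (OFF), .imm = (IMM) }
#define BPF_MOV64_REG(DST, SRC) \
    BPF_INSN(BPF_ALU64 | BPF_MOV | BPF_X, DST, SRC, 0, 0)
#define BPF_MOV64_IMM(DST, IMM) \
    BPF_INSN(BPF_ALU64 | BPF_MOV | BPF_K, DST, 0, 0, IMM)
#define BPF_ALU64_IMM(OP, DST, IMM) \
    BPF_INSN(BPF_ALU64 | BPF_OP(OP) | BPF_K, DST, 0, 0, IMM)
#define BPF_LDX_MEM(SIZE, DST, SRC, OFF) \
    BPF_INSN(BPF_LDX | BPF_SIZE(SIZE) | BPF_MEM, DST, SRC, OFF, 0)
#define BPF_STX_MEM(SIZE, DST, SRC, OFF) \
    BPF_INSN(BPF_STX | BPF_SIZE(SIZE) | BPF_MEM, DST, SRC, OFF, 0)
#define BPF_CALL_HELPER(ID) \
    BPF_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, ID)
#define BPF_EXIT_INSN() \
    BPF_INSN(BPF_JMP | BPF_EXIT, 0, 0, 0, 0)

/* "%d\n" plus its terminating null must fit in a single 32-bit store. */
static_assert(sizeof("%d\n") == 4, "format string must be exactly 4 bytes");

const struct bpf_insn printk_proto_prog[] = {
    /* r2 = "%d\n\0" packed little endian: '%'=0x25, 'd'=0x64, '\n'=0x0a, 0 */
    BPF_MOV64_IMM(BPF_REG_2, 0x000a6425),
    /* *(u32 *)(r10 - 4) = r2: put the format string on the stack */
    BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_2, -4),
    /* r3 = skb->protocol: third argument, the value to print */
    BPF_LDX_MEM(BPF_W, BPF_REG_3, BPF_REG_1,
                offsetof(struct __sk_buff, protocol)),
    /* r1 = r10 - 4: first argument, pointer to the format string */
    BPF_MOV64_REG(BPF_REG_1, BPF_REG_10),
    BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, -4),
    /* r2 = 4: second argument, size of the format string */
    BPF_MOV64_IMM(BPF_REG_2, sizeof("%d\n")),
    /* call bpf_trace_printk() */
    BPF_CALL_HELPER(BPF_FUNC_trace_printk),
    /* return 0 */
    BPF_MOV64_IMM(BPF_REG_0, 0),
    BPF_EXIT_INSN(),
};

const size_t printk_proto_len =
    sizeof(printk_proto_prog) / sizeof(printk_proto_prog[0]);
```

The difference from the program in the question is that r1 now points to a format string rather than to the protocol value, and the protocol is passed in r3. Loading and attaching it should work the same way as for the static-string version that already printed.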