
BPF verifier rejects program when trying to access __sk_buff member


I'm trying to write a sample eBPF program that reads __sk_buff members and dumps them to /sys/kernel/debug/tracing/trace.

#include <uapi/linux/bpf.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_endian.h>

SEC("dump_skb_member")
int test_prog(struct __sk_buff *skb)
{
    char fmt[] = "packet: local %u remote %u\n";
    __u32 local_ip4 = bpf_htonl(skb->local_ip4);
    __u32 remote_ip4 = bpf_htonl(skb->remote_ip4);
    bpf_trace_printk(fmt, sizeof(fmt), local_ip4, remote_ip4);

    return BPF_OK;
}


char _license[] SEC("license") = "GPL";

When I compile this code and load the program:

ip route add 192.168.56.104 encap bpf out obj sample.o section dump_skb_member dev enp0s8

An error is thrown.

Prog section 'dump_skb_member' rejected: Permission denied (13)!
 - Type:         11
 - Instructions: 21 (0 over limit)
 - License:      GPL

Verifier analysis:

0: (b7) r2 = 685349
1: (63) *(u32 *)(r10 -8) = r2
2: (18) r2 = 0x2065746f6d657220
4: (7b) *(u64 *)(r10 -16) = r2
5: (18) r2 = 0x7525206c61636f6c
7: (7b) *(u64 *)(r10 -24) = r2
8: (18) r2 = 0x203a74656b636170
10: (7b) *(u64 *)(r10 -32) = r2
11: (61) r4 = *(u32 *)(r1 +92)
invalid bpf_context access off=92 size=4
processed 9 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0

Error fetching program/map!

But if I don't call bpf_trace_printk to dump the members, the program loads.

My question is: why does calling bpf_trace_printk cause the error?


Solution

  • The error is not caused by bpf_trace_printk() itself, but by the skb accesses, which are present in your bytecode only when you call bpf_trace_printk().

    Accessing skb->local_ip4 and skb->remote_ip4 is not allowed for programs of type BPF_PROG_TYPE_LWT_OUT, which is the type you are using (type 11 in the loader output).

    See the kernel code: the function that checks for valid context access for this program type (lwt_is_valid_access() in net/core/filter.c) returns false for certain offsets or ranges in the skb:

    case bpf_ctx_range_till(struct __sk_buff, family, local_port):
    [...]
            return false;
    

    This corresponds to the range where local_ip4 and remote_ip4 are defined:

    struct __sk_buff {
        [...]
    
        /* Accessed by BPF_PROG_TYPE_sk_skb types from here to ... */
        __u32 family;
        __u32 remote_ip4;   /* Stored in network byte order */
        __u32 local_ip4;    /* Stored in network byte order */
        __u32 remote_ip6[4];    /* Stored in network byte order */
        __u32 local_ip6[4]; /* Stored in network byte order */
        __u32 remote_port;  /* Stored in network byte order */
        __u32 local_port;   /* stored in host byte order */
        /* ... here. */
    

    When you remove your call to the bpf_trace_printk() helper, your local variables are no longer used, so clang optimizes the two skb loads out of the program. The attempt to read at forbidden offsets is no longer part of your bytecode, and the program loads successfully.
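
To connect the verifier message "invalid bpf_context access off=92 size=4" with the fields above, here is a minimal user-space sketch in plain C. It is not kernel code: struct sk_buff_mirror and in_rejected_range() are hypothetical names made up for this illustration. The struct is a hand-written copy of the leading fields of struct __sk_buff as I understand the UAPI layout, and the function only imitates the family..local_port range test quoted from the kernel.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof, size_t */
#include <stdint.h>   /* uint32_t */

/* Hypothetical mirror of the first fields of struct __sk_buff.
 * Assumption: this matches the UAPI layout in linux/bpf.h; check
 * against your kernel headers before relying on the offsets. */
struct sk_buff_mirror {
    uint32_t len;
    uint32_t pkt_type;
    uint32_t mark;
    uint32_t queue_mapping;
    uint32_t protocol;
    uint32_t vlan_present;
    uint32_t vlan_tci;
    uint32_t vlan_proto;
    uint32_t priority;
    uint32_t ingress_ifindex;
    uint32_t ifindex;
    uint32_t tc_index;
    uint32_t cb[5];
    uint32_t hash;
    uint32_t tc_classid;
    uint32_t data;
    uint32_t data_end;
    uint32_t napi_id;
    uint32_t family;
    uint32_t remote_ip4;
    uint32_t local_ip4;
    uint32_t remote_ip6[4];
    uint32_t local_ip6[4];
    uint32_t remote_port;
    uint32_t local_port;
};

/* Offset 92 from the verifier log is remote_ip4 in this mirror. */
static_assert(offsetof(struct sk_buff_mirror, remote_ip4) == 92,
              "remote_ip4 expected at offset 92");

/* Illustrative stand-in for the family..local_port case range:
 * returns 1 if a context read starting at `off` falls inside the
 * span that the LWT access check rejects, 0 otherwise. */
int in_rejected_range(size_t off)
{
    size_t first = offsetof(struct sk_buff_mirror, family);
    size_t end = offsetof(struct sk_buff_mirror, local_port)
                 + sizeof(uint32_t);
    return off >= first && off < end;
}
```

With this mirror, in_rejected_range(92) and in_rejected_range(96) both return 1 (remote_ip4 and local_ip4), while in_rejected_range(0), the offset of len, returns 0. That is consistent with the verifier stopping at instruction 11, the first load from r1 at one of those offsets.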