Search code examples
bpfebpf

ebpf: BPF_FUNC_map_lookup_elem calling convention


Looking at kernel's sample/bpf/sock_example.c:

struct bpf_insn prog[] = {
                BPF_MOV64_REG(BPF_REG_6, BPF_REG_1),
                BPF_LD_ABS(BPF_B, ETH_HLEN + offsetof(struct iphdr, protocol) /* R0 = ip->proto */),
                BPF_STX_MEM(BPF_W, BPF_REG_10, BPF_REG_0, -4), /* *(u32 *)(fp - 4) = r0 */
                BPF_MOV64_REG(BPF_REG_2, BPF_REG_10),
                BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -4), /* r2 = fp - 4 */
                BPF_LD_MAP_FD(BPF_REG_1, map_fd),
                BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
                BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 2),
                BPF_MOV64_IMM(BPF_REG_1, 1), /* r1 = 1 */
                BPF_ATOMIC_OP(BPF_DW, BPF_ADD, BPF_REG_0, BPF_REG_1, 0),
                BPF_MOV64_IMM(BPF_REG_0, 0), /* r0 = 0 */
                BPF_EXIT_INSN(),
        };

I understand that eBPF sets registers r1-r5 to hold arguments to BPF helpers. What I don't understand is why to pass a map fd to BPF_FUNC_map_lookup_elem? According to helpers code :

const struct bpf_func_proto bpf_map_lookup_elem_proto = {
    .func       = bpf_map_lookup_elem,
    .gpl_only   = false,
    .pkt_access = true,
    .ret_type   = RET_PTR_TO_MAP_VALUE_OR_NULL,
    .arg1_type  = ARG_CONST_MAP_PTR,
    .arg2_type  = ARG_PTR_TO_MAP_KEY,
};

which means both arguments are pointers, and none is the map fd. Unless, I'm looking in the wrong code?


Solution

  • File descriptors when writing your program in user space, but later replaced by pointers to the map by the verifier.

    File Descriptors When Writing Your eBPF Program

    You write your eBPF program in user space, where you don't have any address pointer to the map. So you use a file descriptor for referencing that map for the various operations (lookups, updates, deletes) that your program may run.

    If writing your program in C, instead of assembly instructions like you do, this is usually abstracted: The program references the map with a C pointer, but the loader (typically relying on libbpf) performs some relocation step to extract metadata about the map from a dedicated ELF section of the object file, retrieves the file descriptor to the map, and inserts it in the relevant bytecode instructions.

    Kernel Verifier Switches to Pointers

    But you are correct: in the kernel, the BPF_FUNC_map_lookup_elem() helper and the like use pointers to the maps, not file descriptors. This is at load time, during verification of the program, that the verifier replaces the file descriptors by the pointers to the memory area associated to the maps (see resolve_pseudo_ldimm64() from kernel/bpf/verifier.c). It is possible to get a pointer at this time: The verifier does have access to the kernel-memory pointers for those maps.

    Note that the verifier actually goes even further and, for some map types (hash, arrays), it even replaces the calls to the helpers for map lookups completely, using instead instructions to directly read from the relevant addresses in the map (search for map_gen_lookup for details).