Search code examples
ebpfbpfxdp-bpfxdp-ebpflibbpf

Thread safe operations on XDP


I was able to confirm from the documentation that bpf_map_update_elem is an atomic operation if done on HASH_MAPs. Source (https://man7.org/linux/man-pages/man2/bpf.2.html). [Cite: map_update_elem() replaces existing elements atomically]

My question is 2 folds.

  1. What if the element does not exist, is the map_update_elem still atomic?

  2. Is the XDP operation bpf_map_delete_elem thread safe from User space program?

The map is a HASH_MAP.


Solution

  • Atomic ops, race conditions and thread safety are sort of complex in eBPF, so I will make a broad answer since it is hard to judge from your question what your goals are.

    Yes, both the bpf_map_update_elem command via the syscall and the helper function update the maps 'atmomically', which in this case means that if we go from value 'A' to value 'B' that the program always sees either 'A' or 'B' not some combination of the two(first bytes of 'B' and last bytes of 'A' for example). This is true for all map types. This holds true for all map modifying syscall commands(including bpf_map_delete_elem).

    This however doesn't make race conditions impossible since the value of the map may have changed between a map_lookup_elem and the moment you update it.

    What is also good to keep in mind is that the map_lookup_elem syscall command(userspace) works differently from the helper function(kernelspace). The syscall will always return a copy of the data which isn't mutable. But the helper function will return a pointer to the location in kernel memory where the map value is stored, and you can directly update the map value this way without using the map_update_elem helper. That is why you often see hash maps used like:

    value = bpf_map_lookup_elem(&hash_map, &key);
    if (value) {
        __sync_fetch_and_add(&value->packets, 1);
        __sync_fetch_and_add(&value->bytes, skb->len);
    } else {
        struct pair val = {1, skb->len};
    
        bpf_map_update_elem(&hash_map, &key, &val, BPF_ANY);
    }
    

    Note that in this example, __sync_fetch_and_add is used to update parts of the map value. We need to do this since updating it like value->packets++; or value->packets += 1 would result in a race condition. The __sync_fetch_and_add emits a atomic CPU instruction which in this case fetches, adds and writes back all in one instruction.

    Also, in this example, the two struct fields are atomically updated, but not together, it is still possible for the packets to have incremented but bytes not yet. If you want to avoid this you need to use a spinlock(using the bpf_spin_lock and bpf_spin_unlock helpers).

    Another way to sidestep the issue entirely is to use the _PER_CPU variants of maps, where you trade-off congestion/speed and memory use.