Hypothetical scenario: A UDP packet stream arrives at machine X, which is running two programs: one listening for the packets with recv(), and another capturing them with pcap.
In this case, as I understand it, the packets are stored in the interface until it is polled by the kernel, which then moves them into a buffer in the kernel's memory and copies them into two further buffers: one for the program listening with recv(), and one for the program listening with pcap. The packets are removed from the respective buffer when they are read, either by pcap_next() or recv(), the next time the process scheduler runs those programs (I assume they are blocking in this case). Is this correct? Are there really 4 buffers used, or is it handled some other way?
I'm looking for a description, as detailed as possible, of which buffers are really involved in this case, and how packets move from one to the other (e.g. does a packet get copied to pcap's buffer before it goes to the recv() buffer, after, or is the order undefined?).
I know this seems like a big question, but all I really care about is where the packet gets stored, and how long it stays there. Bullet points are fine. Ideally I'd like a general answer, but if it varies between OSes I'm most interested in Linux.
Linux case (BSDs are probably somewhat similar, using mbufs instead of skbuffs):
Linux uses skbuffs (socket buffers) to buffer network data. An skbuff holds metadata about some network data, plus pointers to that data.
Taps (pcap users) create clones of skbuffs. A clone is a new skbuff, but it points to the same data. When someone needs to modify data shared by several skbuffs (the original skbuff and its clones), it first needs to create a fresh copy (copy-on-write).
When someone doesn't need an skbuff anymore, it kfree_skb()'s it. kfree_skb() decrements a reference count, and when that reference count reaches zero, the skbuff is freed. It's slightly more complicated to account for clones, but this is the general idea.
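
To make the clone / copy-on-write / free cycle concrete, here is a small user-space sketch in C. It is not kernel code: `toy_skb`, `toy_data`, `toy_alloc()`, `toy_clone()`, `toy_make_writable()` and `toy_free()` are hypothetical names made up for illustration, and the model is deliberately simplified to a single reference count on the shared data (the real kernel tracks references on the skbuff itself and on its data separately).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Shared packet payload with a reference count: a toy stand-in for the
 * data area that several skbuffs can point to. */
struct toy_data {
    int refs;
    size_t len;
    unsigned char bytes[64];
};

/* Toy stand-in for an skbuff: per-owner metadata plus a pointer to data. */
struct toy_skb {
    struct toy_data *data;
};

/* Allocate a fresh buffer holding a copy of payload (truncated to 64 bytes). */
struct toy_skb *toy_alloc(const void *payload, size_t len)
{
    struct toy_skb *skb = malloc(sizeof *skb);
    struct toy_data *d = malloc(sizeof *d);
    if (!skb || !d) {
        free(skb);
        free(d);
        return NULL;
    }
    if (len > sizeof d->bytes)
        len = sizeof d->bytes;
    memcpy(d->bytes, payload, len);
    d->len = len;
    d->refs = 1;
    skb->data = d;
    return skb;
}

/* Clone: new metadata, same data. This is what a tap would hold. */
struct toy_skb *toy_clone(struct toy_skb *orig)
{
    struct toy_skb *c = malloc(sizeof *c);
    if (!c)
        return NULL;
    c->data = orig->data;
    c->data->refs++;
    return c;
}

/* Copy-on-write: if the data is shared, give skb its own private copy
 * before it is modified. Returns 0 on success, -1 on allocation failure. */
int toy_make_writable(struct toy_skb *skb)
{
    if (skb->data->refs == 1)
        return 0;               /* already the only owner */
    struct toy_data *d = malloc(sizeof *d);
    if (!d)
        return -1;
    *d = *skb->data;            /* copy payload and length */
    d->refs = 1;
    skb->data->refs--;          /* drop our share of the old data */
    skb->data = d;
    return 0;
}

/* Free: drop this owner's metadata; release the data only when the last
 * reference goes away. Returns 1 if the data was actually freed. */
int toy_free(struct toy_skb *skb)
{
    int freed = 0;
    if (--skb->data->refs == 0) {
        free(skb->data);
        freed = 1;
    }
    free(skb);
    return freed;
}
```

In this model, the packet data for the recv() path and the pcap path is the same memory until one side needs to change it; the data goes away only when the last holder has called `toy_free()`, whichever holder that happens to be.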