Libev uses three data structures to storage different watchers.
Heap: for watchers that sorted by time, such as ev_timer
and ev_periodic
.
Linked list: such as ev_io
, ev_signal
, ev_child
and etc.
Array: such as ev_prepare
, ev_check
, ev_async
and etc.
There is no doubt about that uses heap to store timer watcher. But what is the criteria of selecting linked list and array?
The data structure that stores ev_io watchers seems a little complex. It first is an array that with fd
as its index and the element in the array is a linked list of ev_io watcher
. It is more convenient to allocate space for array if use linked list as element. Is it the reason?
Or just because of the insert or remove operation of ev_io
is more frequently and the ev_prepare
seems more stable?
Or any other reasons?
The expectation is that there are only a few (usually one, and almost always at most two) io watchers for the same fd (similarly for signals). Putting the list links into the watcher means no extra allocations are required, as would be required if an array was used per watcher. If a lot of I/O watchers were active on the same fd, then this linked list approach would be slower.
Arrays are used because insertion and removal is very fast (the watcher stores the array index). Using a 4-byte index also reduces memory requirements on 64 bit machines (12 bytes per watcher as opposed to e.g. 16 for a doubly-linked list), and using an array of pointers means that all pointers are near each other in memory, which improves cache efficiency when scanning, which is a frequent operation for some watchers.
Cache efficiency is also the reason why a 4-heap is used over a 2-heap, and the reason why the time values are copied to the heap data structure, to avoid having to access the watcher memory on heap operations.
Child watchers actually use a fixed-size hash table, and again the expectation is that the number of child watchers per hash bucket is small, so the list pointers become part of the watcher data structure.