Search code examples
algorithmdata-structureslinked-listhashtabledynamic-arrays

Using dynamic array to handle collisions in hash tables


Looking around at some of the hash table implementations, separate chaining seems to be handled via a linked list or a tree. Is there a reason why a dynamic array is not used? I would imagine that having a dynamic array would have better cache performance as well. However, since I've not seen any such implementation, I'm probably missing something.

What am I missing?


Solution

  • One advantage of a linked list over a dynamic array is that rehashing can be accomplished more quickly. Rather than having to make a bunch of new dynamic arrays and then copy all the elements from the old dynamic arrays into the new, the elements from the linked lists can be redistributed into the new buckets without performing any allocations.

    Additionally, if the load factor is small, the space overhead of using linked lists may be better than the space overhead for dynamic arrays. When using dynamic arrays, you usually need to store a pointer, a length, and a capacity. This means that if you have an empty dynamic array, you end up needing space for two integers and a pointer, plus any space preallocated to hold the elements. In an empty bucket, this space overhead is large compared to storing just a null pointer for a linked list. On the other hand, if the buckets have large numbers of elements in them, then dynamic arrays will be a bit more space-efficient and have higher performance due to locality of reference.

    Hope this helps!