Tags: arrays, hash, linked-list, hashmap, hash-collision

Why isn't hashmap insertion `O(1)` worst-case when separate chaining is used?


I have learned how a hashmap is implemented: when separate chaining is used, it is an array of linked lists. I know that when a key-value pair (key, val) is inserted, the hashmap inserts the pair into the linked list at index `hash(key) % array.size` of the underlying array.


So, for the insertion process, all that is done is:

  • Compute the hash of the key;
  • Take that hash modulo the array size;
  • Access the array at the resulting index;
  • Allocate a linked list node and insert it into the linked list at that index.
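A minimal sketch of those four steps in Java, using a hypothetical `SimpleHashMap` that blindly inserts at the head of the bucket's list with no duplicate-key check (the strategy the question asks about), might look like this:

```java
import java.util.Objects;

// Hypothetical sketch: separate chaining with head insertion and no duplicate check.
class SimpleHashMap<K, V> {
    private static class Node<K, V> {
        final K key;
        V value;
        Node<K, V> next;
        Node(K key, V value, Node<K, V> next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    private final Node<K, V>[] buckets;

    @SuppressWarnings("unchecked")
    SimpleHashMap(int capacity) {
        buckets = (Node<K, V>[]) new Node[capacity];
    }

    void insert(K key, V value) {
        int hash = Objects.hashCode(key);                 // 1. compute the hash
        int index = Math.floorMod(hash, buckets.length);  // 2. take it modulo the array size
        Node<K, V> head = buckets[index];                 // 3. access the bucket
        buckets[index] = new Node<>(key, value, head);    // 4. O(1) head insertion, duplicates allowed
    }
}
```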

However, isn't this insertion process O(1), assuming the hash is O(1)? Everything above is O(1) except possibly the linked list insertion, and for that we could always insert the pair at the beginning of the list, which is O(1). Isn't insertion therefore always O(1)? If so, why don't hash tables use this strategy?


Solution

  • Typically you don't want duplicate keys in any sort of map, i.e. when you try to insert a pair (key, value) you need to check whether that key already exists in the map. For this check you have to:

    1. Find the respective bucket via hash(key)
    2. Check whether the key already exists in that bucket

    And to check whether the key already exists, you have to iterate over the bucket, which can hold n elements. Thus, with buckets implemented as linked lists, the worst case for insert is still O(n) (see the sketch below).

    There are other implementations that use balanced trees for the buckets. With that approach the worst case for insert drops to O(log n); a sketch follows after the first one below.
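Continuing the hypothetical `SimpleHashMap` sketch from the question, a `put` that rejects duplicate keys has to walk the whole bucket before it can do the O(1) head insertion, and that scan is O(n) in the worst case (all keys colliding into one bucket):

```java
// Drop-in replacement for insert() in the SimpleHashMap sketch above.
void put(K key, V value) {
    int index = Math.floorMod(Objects.hashCode(key), buckets.length);
    // Scan the whole bucket for the key: O(n) worst case.
    for (Node<K, V> node = buckets[index]; node != null; node = node.next) {
        if (Objects.equals(node.key, key)) {
            node.value = value;   // key already present: overwrite, don't duplicate
            return;
        }
    }
    // Key not found anywhere in the bucket: now the O(1) head insertion is safe.
    buckets[index] = new Node<>(key, value, buckets[index]);
}
```

And a minimal sketch of the tree-bucket idea, assuming keys are Comparable so that each bucket can be a `java.util.TreeMap` (a red-black tree), which makes the per-bucket lookup and insert O(log n) instead of O(n):

```java
import java.util.Objects;
import java.util.TreeMap;

// Hypothetical sketch: separate chaining with a balanced tree per bucket.
class TreeBucketHashMap<K extends Comparable<K>, V> {
    private final TreeMap<K, V>[] buckets;

    @SuppressWarnings("unchecked")
    TreeBucketHashMap(int capacity) {
        buckets = (TreeMap<K, V>[]) new TreeMap[capacity];
    }

    void put(K key, V value) {
        int index = Math.floorMod(Objects.hashCode(key), buckets.length);
        if (buckets[index] == null) {
            buckets[index] = new TreeMap<>();
        }
        buckets[index].put(key, value);  // O(log n) within the bucket, duplicates overwritten
    }
}
```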
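For reference, Java's own HashMap has done something similar since Java 8: a bucket whose chain grows past a small threshold is converted from a linked list into a red-black tree, which is exactly the O(n) to O(log n) improvement described above.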