Search code examples
c++data-structureshashtable

Inserting node in Hash table with open addressing [Optimizing the logic]


I am trying to understand a data structure, hash table with open addressing.
I am currently reading on the source code provided by geekforgeeks, but I have a few questions on the code.

Below, is the pasted function for inserting Node from geekforgeeks.

//Function to add key value pair 
    void insertNode(K key, V value) 
    { 
        HashNode<K,V> *temp = new HashNode<K,V>(key, value); 

        // Apply hash function to find index for given key 
        int hashIndex = hashCode(key); 

        //find next free space  
        while(arr[hashIndex] != NULL && arr[hashIndex]->key != key  //// LINE 9 //////
               && arr[hashIndex]->key != -1) 
        { 
            hashIndex++; 
            hashIndex %= capacity; 
        } 

        //if new node to be inserted increase the current size 
        if(arr[hashIndex] == NULL || arr[hashIndex]->key == -1)    //// LINE 17 //////
            size++; 
        arr[hashIndex] = temp; 
    } 

Questions

  1. In line 9, why would you check three conditionals, being,

    • if slot inside the hash table is null ===> arr[hashIndex] != NULL
    • AND if slot has the same key with the node that is going to be inserted ===> arr[hashIndex]->key != key
    • AND if slot has the key of -1, which indicates the slot where node was deleted before ===> arr[hashIndex]->key != -1

      If I were to optimize this code, I believe checking whether the slot is NULL or not is already enough.
  2. In line 17, why would you increment the size property of HashMap before assigning the node to the slot? ===> if(arr[hashIndex] == NULL || arr[hashIndex]->key == -1) size++;
    To me, this logic seems to be messy.
    I would rather do, arr[hashIndex] = temp; size++;

With the assumption of geekforgeeks's logic is well written, could you explain to me why the logic for inserting the new node to a hash table with open addressing is implemented as above specifically on the two points I have raised?


Solution

  • The three conditions to have a valid index are:

    1. The object at the index is NULL
    2. OR the object is not NULL, but its key is the same of the one we're inserting
    3. OR the object is not NULL, but its key value is -1

    Since the negation of all three conditions occurs, we don't have a valid index, and the loop rolls on.

    In line 17: size is incremented only if the insertion doesn't reuse an existing index, so the node is new (which means either condition 1 or 3 applies).