Search code examples
cdata-structureshashtable

No empty entries in hash table using linear probing?


I am trying to understand and implement hash tables using linear probing. While searching for an implementation I have come across this piece of code for a search method:

struct DataItem *search(int key) {
   //get the hash 
   int hashIndex = hashCode(key);

   //move in array until an empty 
   while(hashArray[hashIndex] != NULL) {

      if(hashArray[hashIndex]->key == key)
         return hashArray[hashIndex];

      //go to next cell
      ++hashIndex;

      //wrap around the table
      hashIndex %= SIZE;
   }

   return NULL;        
}

What I don't understand is if all the entries in the hashArray are already occupied by a key, shouldn't the while loop become infinite? or does this never happen? Is there something I am missing?


Solution

  • As you said, this implementation will get stuck in a loop if all the cells are already full and key does not exist in the table. Here is the implementation that should work:

    struct DataItem *search(int key) {
       //get the hash 
       int initialHashIndex = hashCode(key) % SIZE;
    
       //move in array until an empty 
       int hashIndex = initialHashIndex;
       while(hashArray[hashIndex] != NULL) {    
          if(hashArray[hashIndex]->key == key)
             return hashArray[hashIndex];
    
          //go to next cell
          ++hashIndex;
    
          //wrap around the table
          hashIndex %= SIZE;
    
          //check if all the items are checked
          if (hashIndex == initialHashIndex)
              break;
       }
    
       return NULL;        
    }
    

    However, strategies should be implemented to avoid letting a hash table's occupancy level to exceed a certain threshold. Keep in mind that the main purpose of a hash table is to provide constant average (amortized) operations time, independent of the number of elements stored in the hash. Therefore, if the hash table occupancy is high in a Linear Probing Hash, the search time will be a function of stored elements, which annihilates the main goal of using a hash.