Search code examples
c++listhashtable

hashtable: how to understand this implementation of hashtable based on List in STL


I'm learning to build a hashtable with c++. And find this post: https://www.geeksforgeeks.org/c-program-hashing-chaining/.

It implemented a simple and basic version of hashtable (not production level) with chaining to fix hash collision issue.

I followed the post and run it locally and it works as expected. The implementation is as following:

#include <iostream>
#include <list>

using namespace std;

class Hash {
    int BUCKET;
    list<int> *table; // confusing point1 

    public:
        Hash(int V);

        void insertItem(int key);

        void deleteItem(int key);

        int hashFunction(int x) {
            return (x % BUCKET);
        }

        void displayHash();
};

Hash::Hash(int b) {
    this->BUCKET = b;
    table = new list<int>[BUCKET]; // confusing point2
}

void Hash::insertItem(int key) {
    int index = hashFunction(key);
    table[index].push_back(key);
}

void Hash::deleteItem(int key) {
    int index = hashFunction(key);
    list <int> :: iterator i;
    for (i = table[index].begin(); i != table[index].end(); i++) {
        if (*i ==  key) {
            break;
        }
    }

    if (i != table[index].end()) {
        table[index].erase(i);
    }
}

void Hash:: displayHash() {
    for (int i = 0; i < BUCKET; i++) {
        cout << i;
        for (auto x : table[i]) {
            cout << "-->" << x;
        }
        cout << endl;
    }
}

// Driver program  
int main() 
{ 
  // array that contains keys to be mapped 
  int a[] = {15, 11, 27, 8, 12}; 
  int n = sizeof(a)/sizeof(a[0]); 

  // insert the keys into the hash table 
  Hash h(7);   // 7 is count of buckets in 
               // hash table 
  for (int i = 0; i < n; i++)  
    h.insertItem(a[i]);   

  // delete 12 from hash table 
  h.deleteItem(12); 

  // display the Hash table 
  h.displayHash(); 

  return 0; 
}  

I have two confusing points about this implementation:

  • list<int> *table : table should be buckets array. Right? list<int> * should be list type pointer, right? How it works here?

  • table = new list<int>[BUCKET]: I checked many list related
    documents. but didn't find how the [] works?


Solution

  • list<int> *table : table should be buckets array. Right? list<int>* should be list type pointer, right? How it works here?

    In this awful code, table is a pointer to a list<int>, but when you have a pointer to an item and happen to know there's an array of contiguous elements there, you can index it like an array, so table[0] is the same as *table, table[1] would be the next list<int> in memory after table[0] and so on.

    table = new list<int>[BUCKET]: I checked many list related documents. but didn't find how the [] works?

    This is the initialisation that creates an array of list<int> objects and saves their address in table, so we do indeed "happen to know" that the array is there and can index table as an array. For example, inside the displayHash function you see for (auto x : table[i]) - that means x takes on each value from the list<int> at bucket i which is table[i].

    The code also needs a destructor to delete[] table, or all the memory will be leaked when the Hash object's default destructor runs without doing any clean-up.

    You should also be aware that it lets you insert multiple copies of the same key - a proper and full implementation of this functionality is be std::unordered_multiset.

    Cleaning it up minimally - without taking the next steps to use templates to let you use it for other types, add iterators etc.:

    class Hash {
        vector<list<int>> table_;
    
      public:
        Hash(size_t size) : table_{size} { }
    
        void insert(int key) {
            table_[bucket(key)].push_back(key);
        }
    
        void erase(int key) {
            auto& bucket_list = table_[bucket(key)];
            auto it = find(bucket_list.begin(), bucket_list.end(), key);
            if (it != bucket_list.end())
                bucket_list.erase(it);
        }
    
        int bucket(int key) const {
            return hash(key) % table_.size();
        }
    
        static int hash(int key) {
            return key;
        }
    
        // example usage: my_hash.display(std::cout);
        void display(std::ostream& os) const {
            for (size_t i = 0; i < table_.size_; ++i) {
                os << '[' << i << ']';
                for (auto x : table[i])
                     os << "-->" << x;
                os << '\n';
            }
        }
    
        // extensions ===================================
        bool contains(int key) const {
            auto& bucket_list = table_[bucket(key)];
            auto it = find(bucket_list.begin(), bucket_list.end(), key);
            return it != bucket_list.end();
        }
    
        // example usage:
        //   my_hash.visit([](auto key) { std::cout << key << '\n'; });
        template <typename Functor)
        void visit(Functor functor) const {
            for (size_t i = 0; i < table_.size_; ++i)
                for (auto x : table[i])
                     functor(x);
        }
    };