c# hashset gethashcode

Hash function for indexed objects


Say I have a class that numbers every object created from it as 0, ..., n-1 (using a static counter of created objects). Since these objects are used in HashSets and Dictionaries, they need a hash function.

Is there any reason not to use this index as the hash value?


Solution

  • Here is the actual code for Contains on a HashSet<T>:

    private int[] m_buckets;
    private Slot[] m_slots;
    
    public bool Contains(T item) {
        if (m_buckets != null) {
            int hashCode = InternalGetHashCode(item);
            // see note at "HashSet" level describing why "- 1" appears in for loop
            for (int i = m_buckets[hashCode % m_buckets.Length] - 1; i >= 0; i = m_slots[i].next) {
                if (m_slots[i].hashCode == hashCode && m_comparer.Equals(m_slots[i].value, item)) {
                    return true;
                }
            }
        }
        // either m_buckets is null or wasn't found
        return false;
    }
    
    private int InternalGetHashCode(T item) {
        if (item == null) {
            return 0;
        } 
        return m_comparer.GetHashCode(item) & Lower31BitMask;
    }
    
    internal struct Slot {
        internal int hashCode;      // Lower 31 bits of hash code, -1 if unused
        internal T value;
        internal int next;          // Index of next entry, -1 if last
    }
    

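    The key thing to notice is that it calls GetHashCode() (through InternalGetHashCode, which masks off the sign bit) and then computes hashCode % m_buckets.Length on the result to decide which singly linked list to traverse. m_buckets holds the root of each list as an index into m_slots, and each slot's next field points to the following entry in the same list.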
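
    To see what that bucket computation does with sequential hash codes, here is a minimal sketch. BucketDemo, BucketIndex and CountPerBucket are hypothetical names made up for this illustration and are not part of HashSet<T>. The bucket count of 7 is an arbitrary choice.

    ```csharp
    using System;

    static class BucketDemo
    {
        // Same idea as Lower31BitMask in the quoted source: clear the sign bit
        // so the result of % is never negative.
        const int Lower31BitMask = 0x7FFFFFFF;

        // Hypothetical helper mirroring "hashCode % m_buckets.Length".
        public static int BucketIndex(int rawHashCode, int bucketCount)
        {
            int hashCode = rawHashCode & Lower31BitMask;
            return hashCode % bucketCount;
        }

        // Counts how many of the hash codes 0 .. itemCount-1 land in each bucket.
        public static int[] CountPerBucket(int itemCount, int bucketCount)
        {
            var counts = new int[bucketCount];
            for (int index = 0; index < itemCount; index++)
            {
                counts[BucketIndex(index, bucketCount)]++;
            }
            return counts;
        }

        static void Main()
        {
            // 14 sequential hash codes spread over 7 buckets.
            Console.WriteLine(string.Join(",", CountPerBucket(14, 7)));
            // prints "2,2,2,2,2,2,2"
        }
    }
    ```

    Every bucket ends up with the same number of entries, so every chain that Contains walks has the same length.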

    The best possible hash function gives you an even distribution of values across hashCode % m_buckets.Length, so all the linked lists are the same length. Starting at 0 and counting up does this perfectly. So yes: if you can give each object a fixed, unique index that simply counts up, that index is a perfect hash code.
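
    A minimal sketch of such a class might look like the following. IndexedItem and its members are hypothetical names, since the question does not show its code. Using Interlocked.Increment for the counter and overriding Equals to compare by index are assumptions of this sketch; because each index is unique, the default reference equality would be consistent with this hash code as well.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Threading;

    // Hypothetical class illustrating the setup described in the question.
    sealed class IndexedItem : IEquatable<IndexedItem>
    {
        private static int s_created; // number of instances created so far

        // Assigned once in the constructor and never changed afterwards,
        // which is what makes it safe to use as a hash code.
        public int Index { get; }

        public IndexedItem()
        {
            // Interlocked.Increment returns the incremented value,
            // so subtract 1 to start numbering at 0.
            Index = Interlocked.Increment(ref s_created) - 1;
        }

        public bool Equals(IndexedItem other) => other != null && other.Index == Index;

        public override bool Equals(object obj) => Equals(obj as IndexedItem);

        public override int GetHashCode() => Index;
    }

    static class Program
    {
        static void Main()
        {
            var a = new IndexedItem();
            var b = new IndexedItem();
            var set = new HashSet<IndexedItem> { a, b };

            Console.WriteLine(set.Contains(a));                   // True
            Console.WriteLine(set.Contains(new IndexedItem()));   // False
            Console.WriteLine(b.GetHashCode() - a.GetHashCode()); // 1
        }
    }
    ```

    The index must never change while the object is in a HashSet or used as a Dictionary key, which is why it is a get-only property here.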