Search code examples
c#dictionaryhashset

Why would I use a HashSet over a Dictionary?


I'm trying to implement a list of cached paths on a A* algorithm. Currently, the cached paths are stored in a list like this:

readonly List<CachedPath> _cachedPaths = new List<CachedPath>();

The operations performed over this list are:

FirstOrDefault to get an element that satisfies certain conditions

var cached = _cachedPaths.FirstOrDefault(p => p.From == from && p.To == target && p.Actor == self);

Remove and element

_cachedPaths.Remove(cached);

Additions

_cachedPaths.Add(new CachedPath {
                    From = from,
                    To = target,
                    Actor = self,
                    Result = pb,
                    Tick = _world.WorldTick
                });

NOTE: The class CachedPath has GetHashCode and Equals overriden by just the From, To and Actor, so two instances that have these same attributes have the same hash and equality.

Given that quick lookups (Contains), insertions and deletions in a 'HashSet' are O(1) (if I'm not mistaken), I considered using a 'HashSet' to do these operations. The only problem is the FirstOrDefault, that I had to enumerate the whole collection to get it.

Given this problem, I considered also using a Dictionary indexed by the hash of From, To and Actor:

Dictionary<int, CachedPath> cachedPath

Once again, if I'm not mistaken, Dictionary also offers O(1) in insertions, deletions, and also retrieval by Key. This leads me to think that a Dictionary is a HashSet + O(1) element retrieval capabilities.

Am I missing something? Is really Dictionary better than HashSet in the sense that it supports more operations?

Thanks in advance.


Solution

  • Dictionary is not better than HashSet, it's just different.

    • You use a HashSet when you want to store an unordered collection of items, and
    • You use a Dictionary when you want to associate a set of items called "keys" with another collection of items called "values"

    One could think of a HashSet as a Dictionary with no associated values (in fact, HashSet is sometimes implemented using a Dictionary behind the scene) but it is not necessary to think about it in this way: thinking of the two as of entirely different things works fine, too.

    In your case you could potentially improve performance by making a dictionary by actor, like this:

    Dictionary<ActorType,List<CachedPath>> _cachedPathsByActor
    

    This way your linear search would quickly choose a sub-list based on an actor, and then search linearly by target:

    var cached = _cachedPathsByActor[self].FirstOrDefault(p => p.From == from && p.To == target);
    

    or by making an equality comparer that considers all three items, and using a Dictionary with CachedPath as both keys and values, and that custom IEqualityComparer<T> as the key comparer:

    class CachedPathEqualityComparer : IEqualityComparer<CachedPath> {
        public bool Equals(CachedPath a, CachedPath b) {
            return a.Actor == b.Actor
                && a.From == b.From
                && a.To == b.To;
        }
        public int GetHashCode(CachedPath p) {
            return 31*31*p.Actor.GetHashCode()+31*p.From.GetHashCode()+p.To.GetHashCode();
        }
    }
    ...
    var _cachedPaths = new Dictionary<CachedPath,CachedPath>(new CachedPathEqualityComparer());
    ...
    CachedPath cached;
    if (_cachedPaths.TryGetValue(self, out cached)) {
        ...
    }
    

    However, this approach assumes that there would be at most one item in the dictionary with identical From, To, and Actor.