Tags: c#, .net, algorithm, dictionary, gethashcode

Create Unique Hashcode for the permutation of two Order Ids


I have a collection containing the permutations of two orders, where each OrderId is unique. For Order1 (Id = 1) and Order2 (Id = 2) it therefore contains the pair in both orders, 12 and 21. While a routing algorithm processes the collection, a few conditions are checked, and once a combination is included in the final result, its reverse must be ignored and need not be processed. Since the Id is an integer, I wrote the following logic:

    private static int GetPairKey(int firstOrderId, int secondOrderId)
    {
        var orderCombinationType = (firstOrderId < secondOrderId)
            ? new { max = secondOrderId, min = firstOrderId }
            : new { max = firstOrderId, min = secondOrderId };

        return orderCombinationType.min.GetHashCode() ^ orderCombinationType.max.GetHashCode();
    }

I create a Dictionary<int,int> whose key is produced by the GetPairKey method shown above. The method arranges the two ids in a consistent order, so that a pair and its reverse yield the same hash code, which can then be inserted into the Dictionary and checked for later. The value is a dummy and is ignored.

However, this logic seems to have a flaw: it does not work as expected in all cases. What am I doing wrong? Should I create the hash code in a different way? Would something like the following be a better choice? Please suggest.

    Tuple.Create(minOrderId, maxOrderId).GetHashCode()

The relevant usage is as follows:

    foreach (var pair in localSavingPairs)
    {
        var firstOrder = pair.FirstOrder;
        var secondOrder = pair.SecondOrder;

        // Skip the pair if it, or its reverse, has already been processed.
        if (processedOrderDictionary.ContainsKey(GetPairKey(firstOrder.Id, secondOrder.Id))) continue;

        // ... rest of the loop body
    }

The following code adds the key to the Dictionary:

    processedOrderDictionary.Add(GetPairKey(firstOrder.Id, secondOrder.Id), 0); // the value 0 is a dummy and is never used


Solution

  • You need a key that uniquely represents every possible pair.

    That is different from a hash code: two different pairs are allowed to share the same hash code.
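
    To see why a hash code cannot serve as the key, here is a minimal sketch that reuses the XOR combination from the question's GetPairKey (int.GetHashCode() returns the integer itself) and shows two unrelated pairs producing the same key. The class name XorCollisionDemo and the sample ids are made up for illustration.

```csharp
using System;

public static class XorCollisionDemo
{
    // Same combination as the question's GetPairKey: XOR of the two ids.
    public static int GetPairKey(int firstOrderId, int secondOrderId)
    {
        int min = Math.Min(firstOrderId, secondOrderId);
        int max = Math.Max(firstOrderId, secondOrderId);
        return min.GetHashCode() ^ max.GetHashCode();
    }

    public static void Main()
    {
        // Reversed pairs map to the same key, as intended.
        Console.WriteLine(GetPairKey(1, 2) == GetPairKey(2, 1)); // True

        // But so do unrelated pairs: 1 ^ 2 == 3 and 4 ^ 7 == 3.
        Console.WriteLine(GetPairKey(1, 2) == GetPairKey(4, 7)); // True
    }
}
```

    With such a key, once the pair (1, 2) is recorded, the pair (4, 7) would wrongly be skipped as already processed.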

    You could uniquely represent each pair with a long, or with a class or struct that contains all of the relevant values. A long stops working once the combined values no longer fit in 64 bits, so let's look at the other approach, which is more flexible and more extensible:

    public class KeyPair : IEquatable<KeyPair>
    {
      public int Min { get; private set; }
      public int Max { get; private set; }
    
      public KeyPair(int first, int second)
      {
        if (first < second)
        {
          Min = first;
          Max = second;
        }
        else
        {
          Min = second;
          Max = first;
        }
      }
    
      public bool Equals(KeyPair other)
      {
        return other != null && other.Min == Min && other.Max == Max;
      }
    
      public override bool Equals(object other)
      {
        return Equals(other as KeyPair);
      }
    
      public override int GetHashCode()
      {
        return unchecked(Max * 31 + Min);
      }
    }
    

    Now, the GetHashCode() here will not be unique, but the KeyPair itself will be. Ideally the hash codes of different pairs will differ widely, so that the objects are well distributed, but doing much better than the above depends on knowing the actual values that will be seen in practice.

    The dictionary uses the hash code to find the item, and then uses Equals to pick between items whose hash codes are the same.
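
    A minimal sketch of that two-step lookup, assuming the KeyPair class above (repeated here in compact form so the example compiles on its own). The pairs (31, 32) and (0, 33) are chosen because both hash to 1023 under Max * 31 + Min, yet they stay distinct thanks to Equals. HashSet<KeyPair> stands in for the question's Dictionary<int,int>, on the assumption that the dummy value is never needed; KeyPairDemo is a made-up name.

```csharp
using System;
using System.Collections.Generic;

public sealed class KeyPair : IEquatable<KeyPair>
{
    public int Min { get; }
    public int Max { get; }

    public KeyPair(int first, int second)
    {
        Min = Math.Min(first, second);
        Max = Math.Max(first, second);
    }

    public bool Equals(KeyPair other) =>
        other != null && other.Min == Min && other.Max == Max;

    public override bool Equals(object obj) => Equals(obj as KeyPair);

    public override int GetHashCode() => unchecked(Max * 31 + Min);
}

public static class KeyPairDemo
{
    public static void Main()
    {
        var processed = new HashSet<KeyPair>();

        Console.WriteLine(processed.Add(new KeyPair(1, 2)));   // True: first time seen
        Console.WriteLine(processed.Add(new KeyPair(2, 1)));   // False: reverse of a processed pair

        // Same hash code (1023), different pairs: both are stored.
        Console.WriteLine(processed.Add(new KeyPair(31, 32))); // True
        Console.WriteLine(processed.Add(new KeyPair(0, 33)));  // True
        Console.WriteLine(processed.Count);                    // 3
    }
}
```

    HashSet<T>.Add returns false when an equal item is already present, so the check and the insertion from the question's loop collapse into a single call.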

    (You can experiment with this by writing a version in which GetHashCode() always returns 0. Performance will be very poor, because collisions are expensive and every key will collide, but it will still work.)
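
    That experiment might look like the sketch below. ConstantHashPair is a hypothetical variant of KeyPair whose GetHashCode() always returns 0; lookups still give correct answers because Equals does all the distinguishing, only more slowly as the set grows.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical variant of KeyPair used only for this experiment.
public sealed class ConstantHashPair : IEquatable<ConstantHashPair>
{
    public int Min { get; }
    public int Max { get; }

    public ConstantHashPair(int first, int second)
    {
        Min = Math.Min(first, second);
        Max = Math.Max(first, second);
    }

    public bool Equals(ConstantHashPair other) =>
        other != null && other.Min == Min && other.Max == Max;

    public override bool Equals(object obj) => Equals(obj as ConstantHashPair);

    // Every instance collides, so the set relies on Equals for every comparison.
    public override int GetHashCode() => 0;
}

public static class ConstantHashDemo
{
    public static void Main()
    {
        var processed = new HashSet<ConstantHashPair>();
        for (int i = 0; i < 100; i++)
        {
            processed.Add(new ConstantHashPair(i, i + 1));
        }

        Console.WriteLine(processed.Count);                                  // 100
        Console.WriteLine(processed.Contains(new ConstantHashPair(51, 50))); // True
        Console.WriteLine(processed.Contains(new ConstantHashPair(50, 52))); // False
    }
}
```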