Search code examples
c#hexhashcodefloating-accuracyfloating-point-exceptions

Method to ensure GetHashCode() overload returns the same for semi-equal R3 float vectors


This one is for the binary and primitive experts. I am implementing a float R3 vector struct and my definition for "equality" is actually "mostly equal." Specifically, for all coords of the compared vectors Abs( (a[i] - b[i]) / (a[i] + b[i]) ) < .00001 returns true.

private static bool FloatEquality(float a, float b)
    {
        if (a == b)
        {
            return true;
        }
        else
        {
            float e;
            try
            {
                e = (b - a) / (b + a);
            }
            catch (DivideByZeroException)
            {
                float g = float.Epsilon;
                e = (b - a) / g;
            }
            //AppConsole.AppConsole.Instance.WriteLine(e);
            if (e < .00001f && e > -.00001f)
            {
                return true;
            }
            else
            {
                return false;
            }
        }
    }

My problem is in determining if there's a way to get the hash values to come out the same on vectors that meet this requirement due to the fact that I want to be able to use these vectors as "keys" for a Dictionary.

As you can see, the above code is used to check for equality on 3 different coordinates.

I was thinking of extracting the bytes from the three float coordinates and using the middle two from each.

(the following isn't code but Stack Overflow won't let me post it unless I indent it)

Vector(x,y,z):
x's float byte[] = [ x1 x2 x3 x3 ]
y's float byte[] = [ y1 y2 y3 y4 ]
z's float byte[] = [ z1 z2 z3 z4 ]

Hash code: byte[] {x2^x3 , y2^y3, z2 ^ z3, x2 ^ z3}

Or something like that... In short - I'm curious how to ensure that the hashcodes of vectors which fit my equals method will always come out the same... If someone has a great idea with very low cost computation, I'd love to hear it. Or if you could direct me to a place that discusses more in depth how floats are stored and which bytes will always be the same if the above comparison method returns equal.

I may need a new comparison method rather than a hash function because there's really no way that I can be sure that any of the bytes will match I guess...


Solution

  • Well, the basic idea is simple - you have to artificially reduce the precision of your floats. How to do this efficiently depends a lot on the kind of data you're expecting to see.

    For example, if you're mostly using small values, you could simply use something like this:

    (int)Math.Round(x1 * 1000) 
    ^ (int)Math.Round(x2 * 1000) 
    ^ (int)Math.Round(x3 * 1000)
    

    Note that while I'm not actually fulfilling your if (e < .00001f && e > -.00001f) condition, it doesn't matter - the idea is to reduce the collisions, and ensure that what values that are equal will have equal hash codes. It's not necessary (or possible) to also ensure that values that are not equal will not have equal hash code. The rest should be handled in the overrides of Equals, == etc. - that's where strict equality checks must be present. Unlike Equals and company, GetHashCode() only has data about a single vector, so you don't even have an option of using data from more than that single vector in there.

    Hash codes are only there to make key collisions infrequent. So Dictionary will still work if each of your vectors will return 0 in GetHashCode() - it's just that the performance will suffer. As long as equal vectors end up with equal hash codes, the hash code can be anything that suits your needs :)

    Of course, the best way would simply be not to use vectors as keys in the dictionary. Find the part of the vector that interests you (helps you the most), and use that as a key. Maybe you'll find out Dictionary isn't actually what you want anyway (for example, in a game, there's tons of different space partitioning methods that can be used with vectors - from simple grid-like layouts, through manual space partitioning, up to things like BSP).