Search code examples
pythonhashhashtable

Maximum/minimum value returned by Python's hash() function


Context: building a consistent hashing algorithm.

The official documentation for Python's hash() function states:

Return the hash value of the object (if it has one). Hash values are integers.

However, it does not explicitly state whether the function maps to an integer range (with a minimum and a maximum) or not.

Coming from other languages where values for primitive types are bounded (e.g. C#'s/Java's Int.MaxValue), I know that Python's likes to think in "unbounded" terms – i.e. switching from int to long in the background.

Am I to assume that the hash() function also is unbounded? Or is it bounded, for example mapping to what Python assigns to the max/min values of the "int-proper" – i.e. between -2147483648 through 2147483647?


Solution

  • As others pointed out, there is a misplaced[1] Note in the documentation that reads:

    hash() truncates the value returned from an object’s custom hash() method to the size of a Py_ssize_t.

    To answer the question, we need to get this Py_ssize_t. After some research, it seems that it is stored in sys.maxsize, although I'd appreciate some feedback here.

    The solution that I adopted eventually was then:

    import sys
    bits = sys.hash_info.width              # in my case, 64
    print (sys.maxsize)                     # in my case, 9223372036854775807
    
    # Therefore:
    hash_maxValue = int((2**bits)/2) - 1    # 9223372036854775807, or +sys.maxsize
    hash_minValue = -hash_maxValue          # -9223372036854775807, or -sys.maxsize
    

    Happy to receive comments/feedbacks on this – until proven wrong, this is the accepted answer.


    [1] The note is included in the section dedicated to __hash__() instead of the one dedicated to hash().