Context: building a consistent hashing algorithm.
The official documentation for Python's hash()
function states:
Return the hash value of the object (if it has one). Hash values are integers.
However, it does not explicitly state whether the function maps to an integer range (with a minimum and a maximum) or not.
Coming from other languages where values for primitive types are bounded (e.g. C#'s/Java's Int.MaxValue
), I know that Python's likes to think in "unbounded" terms – i.e. switching from int
to long
in the background.
Am I to assume that the hash()
function also is unbounded? Or is it bounded, for example mapping to what Python assigns to the max/min values of the "int-proper" – i.e. between -2147483648
through 2147483647
?
As others pointed out, there is a misplaced[1] Note in the documentation that reads:
hash() truncates the value returned from an object’s custom hash() method to the size of a Py_ssize_t.
To answer the question, we need to get this Py_ssize_t
. After some research, it seems that it is stored in sys.maxsize
, although I'd appreciate some feedback here.
The solution that I adopted eventually was then:
import sys
bits = sys.hash_info.width # in my case, 64
print (sys.maxsize) # in my case, 9223372036854775807
# Therefore:
hash_maxValue = int((2**bits)/2) - 1 # 9223372036854775807, or +sys.maxsize
hash_minValue = -hash_maxValue # -9223372036854775807, or -sys.maxsize
Happy to receive comments/feedbacks on this – until proven wrong, this is the accepted answer.
[1] The note is included in the section dedicated to __hash__()
instead of the one dedicated to hash()
.