Tags: python, machine-learning, hash, firebase-mlkit

I want to understand the following code lines defined in the function


I am a beginner in Python and machine learning. While working on a project from the book "Hands-On Machine Learning with Scikit-Learn and TensorFlow", I came across this way of creating a test set using hashlib. Can you please help me understand the logic of the return statement, step by step?

import numpy as np

def test_set_check(identifier, test_ratio, hash):
    return hash(np.int64(identifier)).digest()[-1] < 256 * test_ratio

Solution

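  • Assuming hash is a hash constructor from hashlib (for example, hashlib.md5):

    • cast identifier to a (NumPy) 64-bit integer
    • hash the cast identifier
    • get the value of the last byte of the hash (an integer from 0 to 255 in Python 3)
    • compare that value to (256 * test_ratio)
    • return the result of the comparison: True for roughly a test_ratio fraction of identifiers, which are the ones that go into the test set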