Which part of this code ensures the result is in the range 0.,.m-1?

I am new to C and am confused by this hashing code:

#include <stdint.h>
// defines uint32/64_t as unsigned 32/64-bit integer.
uint32_t hash(uint32_t x, uint32_t m, uint64_t a, uint64_t b) {
// hashes x strongly universally into the range [m]
// using the random seeds a and b.
return (((a*x+b)>>32)*m)>>32;
}

The result is supposed to be in the range 0...m-1 but I don't see how that is ensured.

x and m are 32 bit integers while a and b are uniformly random 64 bit integers. (a*x+b)>>32 only uses the 32 least significant bits, so multiplied with the 32-bit integer m, we get the exact product in 64 bits with no overflow. I still don't see why the result is in the range 0..m-1 though.

Solution

Briefly

The type of a*x is uint64_t, so it produces at most a 64-bit result (barring an exotic C implementation with int wider than 64 bits). Then >>32 removes 32 bits, so the result is less than 2³². Then multiplying by m produces something less than 2³²•m. Then applying >>32 to that produces something less than 2³²•m/2³² = m.

So the result is less than m.

Fuller

The type of a is uint64_t and the type of x is uint32_t. In a*x, x is converted to uint64_t, and the multiplication is performed with uint64_t.

The type uint64_t can represent values in [0, 2⁶⁴), so the result is in that interval. (“[a, b)” means the interval includes the endpoint a and does not include the endpoint b.) If the mathematical result of the multiplication is outside that interval, the computed result equals the mathematical result reduced by subtracting some integer multiple of 2⁶⁴—higher bits are discarded.

Then b is added to this. The type of b is uint64_, so the result is again in [0, 2⁶⁴).

Then >>32 is applied. This is defined to produce the result of dividing by 2³² and discarding the fractional part. So the result must be in [0, 2³²). Equivalently, you can think of it as discarding 32 bits from a 64-bit number (possibly with leading 0 bits), yielding a 32-bit number.

Then we multiply by m. Multiplying a number in [0, 2³²) by m produces a number in [0•m, 2³²•m). (This result fits in a uint64_t because the type of m is uint32_t, so we are multiplying two 32-bit numbers.)

Finally, >>32 is applied again. The result is some number in [0•m, 2³²•m) divided by 2³², with the fractional part discarded. So the result is in [0•m/2³², 2³²•m/2³²) = [0, m).

Therefore, the result is in [0, m). Since the fractional part is discarded, it must be in [0, m−1].