Tags: c, algorithm, statistics, integer, uniform

Generating a uniform distribution of INTEGERS in C


I've written a C function that I think selects integers from a uniform distribution with range [rangeLow, rangeHigh], inclusive. This isn't homework--I'm just using this in some embedded systems tinkering that I'm doing for fun.

In my test cases, this code appears to produce an appropriate distribution. I'm not feeling fully confident that the implementation is correct, though. Could someone do a sanity check and let me know if I've done anything wrong here?

//uniform_distribution returns an INTEGER in [rangeLow, rangeHigh], inclusive.
int uniform_distribution(int rangeLow, int rangeHigh)
{
    int myRand = (int)rand(); 
    int range = rangeHigh - rangeLow + 1; //+1 makes it [rangeLow, rangeHigh], inclusive.
    int myRand_scaled = (myRand % range) + rangeLow;
    return myRand_scaled;
}
//note: make sure rand() was already initialized using srand()

P.S. I searched for other questions like this. However, it was hard to filter out the small subset of questions that discuss random integers instead of random floating-point numbers.


Solution

  • On some implementations, rand() produces poorly distributed low-order bits, so taking the result modulo range gives results that are not very random. If you find that to be the case, you could scale the result instead:

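    int uniform_distribution(int rangeLow, int rangeHigh) {
        /* Map rand() into [0, 1); dividing by RAND_MAX + 1.0 keeps 1.0 out. */
        double myRand = rand()/(1.0 + RAND_MAX); 
        int range = rangeHigh - rangeLow + 1;
        /* Truncate before adding rangeLow; otherwise a negative rangeLow
           rounds toward zero and skews the results. */
        int myRand_scaled = (int)(myRand * range) + rangeLow;
        return myRand_scaled;
    }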
    

    As Lior notes, using rand() this way still produces a bias whenever range does not evenly divide the number of distinct values rand() can return. The technique is fine, though, if you compute myRand with a generator that is uniform at a much finer resolution. One possible candidate is drand48() (a POSIX function, not part of standard C). It reduces the bias to something that would be very difficult to detect.
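
    To make the bias concrete, here is a minimal sketch that counts how many generator outputs land in each bucket. It assumes a toy generator with `outputs` equally likely values standing in for rand(); the function names are hypothetical and not from any library.

    ```c
    /* Count how many of the generator's equally likely values 0 .. outputs-1
     * land in each of `range` buckets when scaled like
     * rand()/(1.0 + RAND_MAX) * range. `counts` must hold `range` ints. */
    void count_scaled_buckets(int outputs, int range, int *counts)
    {
        for (int i = 0; i < range; i++)
            counts[i] = 0;
        for (int r = 0; r < outputs; r++) {
            double unit = r / (double)outputs;   /* always in [0, 1) */
            counts[(int)(unit * range)]++;
        }
    }

    /* The same count for the modulo mapping r % range, for comparison. */
    void count_modulo_buckets(int outputs, int range, int *counts)
    {
        for (int i = 0; i < range; i++)
            counts[i] = 0;
        for (int r = 0; r < outputs; r++)
            counts[r % range]++;
    }
    ```

    With 8 possible outputs and a range of 3, both mappings give bucket counts of 3, 3 and 2, so one result is less likely than the other two. Scaling changes which bits of the generator matter, not the size of the bias.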

    However, if you need something cryptographically secure, you should use the algorithm outlined in Lior's answer, assuming your generator is itself cryptographically secure (the default rand() is probably not, so you would need to find one). Below is a simplified implementation of what Lior described, where secure_rand() is a placeholder for such a generator returning values in [0, RAND_MAX]. Instead of counting bits, we assume range is no larger than RAND_MAX and reject any draw at or above a suitable multiple of range. In the worst case, the algorithm calls the random number generator fewer than twice on average per requested number.

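    int uniform_distribution_secure(int rangeLow, int rangeHigh) {
        int range = rangeHigh - rangeLow + 1;
        /* Largest multiple of range that does not exceed RAND_MAX. */
        int secureMax = RAND_MAX - RAND_MAX % range;
        int x;
        /* Redraw until x falls below the multiple, so every residue is equally likely. */
        do x = secure_rand(); while (x >= secureMax);
        return rangeLow + x % range;
    }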
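
    The rejection step can be checked the same way on a small model. This is a sketch under stated assumptions: `genMax` plays the role of RAND_MAX, the generator is assumed to return each value in 0 .. genMax with equal probability, range is assumed to be no larger than genMax, and all function names are hypothetical.

    ```c
    /* Draws at or above this limit are rejected; the limit is a multiple of range. */
    int rejection_limit(int genMax, int range)
    {
        return genMax - genMax % range;
    }

    /* Count how many accepted generator values map to each result bucket.
     * `counts` must hold `range` ints. */
    void count_accepted_buckets(int genMax, int range, int *counts)
    {
        int limit = rejection_limit(genMax, range);
        for (int i = 0; i < range; i++)
            counts[i] = 0;
        for (int x = 0; x < limit; x++)
            counts[x % range]++;
    }

    /* Average number of generator calls per result: 1 / P(accept),
     * where P(accept) = limit / (genMax + 1). */
    double expected_calls(int genMax, int range)
    {
        return (genMax + 1.0) / rejection_limit(genMax, range);
    }
    ```

    For genMax = 7 and a range of 3, the limit is 6 and each bucket receives exactly 2 accepted values, so the result is unbiased. Because more than half of the generator's values are always accepted, expected_calls stays below 2, which matches the worst-case estimate above.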