Search code examples
c#randombirthday-paradox

How random is Random.Next()?


I have been doing some testing on the Random class and I have used the following code:

while (x++ <= 5000000)
{
    y = rnd.Next(1, 5000000);
    if (!data.Contains(y))
        data.Add(y);
    else
    {
        Console.WriteLine("Cycle {2}: Repetation found for number {0} after {1} iteration", y, x, i);
        break;
    }
}

I kept changing the rnd max limit (i.e. 5000000) and I changed the number of iterations and I got the following result:

1) if y = rnd.Next(1, 5000) : The average is between 80 to 110 iterations
2) if y = rnd.Next(1, 5000000) : The average is between 2000 to 4000 iterations
3) if y = rnd.Next(1, int.MaxValue) : The average is between 40,000 to 80,000 iterations.

Why am I getting these averages, i.e. out of 10 times I checked for each value, 80% of the time I get within this average range. I dont think we can call it near to being Random.

What can I do to get a fairly random number.


Solution

  • You are not testing for cycles. You are testing for how long it takes to get a random number you've had before. That's completely different. Your figures are spot on for testing how long it takes to get a random number you had before. Look in wikipedia under "the birthday paradox" for a chart of the probability of getting a collision after a certain number of iterations.

    Coincidentally, last week I wrote a blog article about this exact subject. It'll go live on March 22nd; see my blog then for details.

    If what you want to test for is the cycle length of a pseudo-random number generator then you need to look for not a number you've had before, but rather, a lengthy exact sequence of numbers that you've had before. There are a number of interesting ways to do that, but it's probably easier for me to just tell you: the cycle length of Random is a few billion, so you are unlikely to be able to write a program that discovers that fact. You'd have to store a lot of numbers.

    However, cycle length is not the only measure of quality of a pseudo-random number generator. Remember, PRNGs are not random, they are predictable, and therefore you have to think very carefully about what your metric for "randomness" is.

    Give us more details: why do you care how "random" Random is? What application are you using it for that you care? What aspects of randomness are important to you?