I am interested in a function rand(x, y, seed)
that returns (pseudo) random numbers based on its arguments, with the following properties:
The value returned should depend on its 3 arguments, and not on the number of times rand was called so far. For example, assuming these calls, in this order:
    rand(0, 0, 123) = 1
    rand(0, 1, 123) = 2
    rand(0, 2, 123) = 3
Then, calling rand with the same arguments but in a different order should give the same values. For example:

    rand(0, 1, 123) = 2
    rand(0, 2, 123) = 3
    rand(0, 0, 123) = 1
The function should have the usual properties of a good (decent, I don't really need anything very fancy) PRNG: large period, uniform distribution, etc. Returning positive integers that fit in a signed int is fine; it can also go higher if you want.
If it helps, my seeds will always be the unix timestamp in milliseconds (can be in seconds too if that makes it easier somehow). All arguments can go as high as 32 bit signed ints, but working with 64 bit values inside the function is not a problem.
What function could I use for this?
What I thought of:
Perlin noise seems to do some of what I want, but I have no idea how suitable it really is as a PRNG, especially distribution-wise. I'm also not sure how efficient it is, since my (x, y)
parameters will be rather random, and I cannot precompute it for all of them.
I also looked into the following function:
    p = 1400328593
    rand(x, y, seed) = (x * x * seed + y * seed * seed + seed * x * y + seed) mod p
                     = (seed * (x * x + y * seed + x * y + 1)) mod p
This seems to generate good-enough numbers. Based on my (very weak) tests, they also seem to be distributed very well. Testing the period is harder, though; I haven't done that.
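In C, that formula could look something like the sketch below (rand_poly is just my name for it; I reduce the arguments mod p up front and apply the modulus after each multiplication so the 64-bit intermediates can't overflow, and negative arguments simply wrap to their unsigned 32-bit representation):

    #include <stdint.h>

    #define P 1400328593ULL

    /* (x*x*seed + y*seed*seed + seed*x*y + seed) mod P.
     * Every factor is reduced below P (< 2^31) first, so each product
     * stays below P^2 < 2^63 and fits comfortably in a uint64_t. */
    static uint32_t rand_poly(int32_t x, int32_t y, int32_t seed)
    {
        uint64_t xm = (uint32_t)x % P;
        uint64_t ym = (uint32_t)y % P;
        uint64_t sm = (uint32_t)seed % P;

        uint64_t t1 = xm * xm % P * sm % P;   /* x * x * seed    */
        uint64_t t2 = ym * sm % P * sm % P;   /* y * seed * seed */
        uint64_t t3 = sm * xm % P * ym % P;   /* seed * x * y    */

        return (uint32_t)((t1 + t2 + t3 + sm) % P);
    }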
Update:
Here is the output of Ent for the above function, with time(NULL) in C as its seed and values generated for (x, y) in {0 ... 999} x {0 ... 999}:
    Entropy = 3.312850 bits per byte.
    Optimum compression would reduce the size of this 9207076 byte file by 58 percent.
    Chi square distribution for 9207076 samples is 229710872.43, and randomly would exceed this value less than 0.01 percent of the times.
    Arithmetic mean value of data bytes is 52.3354 (127.5 = random).
    Monte Carlo value for Pi is 4.000000000 (error 27.32 percent).
    Serial correlation coefficient is 0.036131 (totally uncorrelated = 0.0).
Is this good enough in practice (in theory, the above tests suggest that it's not good at all), or is there something well-known that I should be using?
It sounds like you want a hash function. Pick a cryptographic one such as SHA-1 if it's not too inefficient, since it's guaranteed to have good distribution characteristics; otherwise, you can use a common non-cryptographic hash function such as FNV. Simply use your seed and coordinates as the input data, and use the hash as the random value.
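For example, here is a rough sketch using 64-bit FNV-1a (the name rand_hash and the way the arguments are packed into a byte buffer are just illustrative choices; the seed is taken as 64 bits since a millisecond timestamp doesn't fit in 32):

    #include <stdint.h>
    #include <string.h>

    /* 64-bit FNV-1a over an arbitrary byte buffer. */
    static uint64_t fnv1a_64(const void *data, size_t len)
    {
        const unsigned char *p = data;
        uint64_t h = 14695981039346656037ULL;   /* FNV offset basis */
        for (size_t i = 0; i < len; i++) {
            h ^= p[i];
            h *= 1099511628211ULL;              /* FNV prime */
        }
        return h;
    }

    /* The result depends only on (x, y, seed), never on call order.
     * The byte order of the packing affects the values, not their quality. */
    static int32_t rand_hash(int32_t x, int32_t y, int64_t seed)
    {
        unsigned char buf[sizeof x + sizeof y + sizeof seed];
        memcpy(buf, &x, sizeof x);
        memcpy(buf + sizeof x, &y, sizeof y);
        memcpy(buf + sizeof x + sizeof y, &seed, sizeof seed);
        return (int32_t)(fnv1a_64(buf, sizeof buf) & 0x7fffffff);  /* positive, fits a signed int */
    }

Masking the 64-bit hash down to 31 bits keeps the result a positive value that fits in a signed int, as you asked.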