When I run simulations in R, I often write the code so that there is a one-to-one mapping between a seed and a simulation run, rather than specifying a number of repetitions under a single seed.
set.seed(1)
run_simulation()
set.seed(2)
run_simulation()
Compared to
set.seed(1)
run_simulation_ntimes(n)
Can the random state that R stores in .Random.seed be the same for different seeds, or overlap in such a way that the random results would be partly identical?
For example hypothetically:
set.seed(1)
random_number(3)
# .341 .276 .58
set.seed(2)
random_number(3)
# .276 .58 .68
In this hypothetical, .276 and .58 are identical random numbers produced from the same states under the two seeds.
I understand that two different random states can produce the same random number. Can two different seeds produce the same random states at least partially?
It is unlikely that different s values for set.seed(s) will produce the same random state, but that is not the only possible problem with the scheme you are using.
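If you want to check this empirically for a range of seeds, something along these lines should work. It is only a sketch: the range 1:1000 is an arbitrary choice of mine, it assumes the default generator settings, and it only detects states that are identical in full, not partial overlap between streams.

```r
# Collect the full generator state produced by each seed in an arbitrary range.
seeds <- 1:1000
states <- lapply(seeds, function(s) {
  set.seed(s)
  # .Random.seed lives in the global environment; fetch it explicitly.
  get(".Random.seed", envir = globalenv())
})

# anyDuplicated() returns 0 when no two states in the list are identical,
# otherwise the index of the first duplicate.
n_dup <- anyDuplicated(states)
n_dup
```

A result of 0 says nothing about seeds outside the range you tested, and nothing about whether the resulting streams behave independently.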
If you call runif(n), then the n values you receive will appear to be independent under many tests. However, if you put runif(1) in a loop and generate the n values with sequential seeds, there is no reason to believe the n values you get will have a good approximation to independence.
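To make the two schemes concrete, here is a minimal sketch of each. The variable names and n = 1000 are just for illustration. Running it does not tell you whether the re-seeded values are "independent enough"; it only shows what is being compared.

```r
n <- 1000

# Scheme 1: one seed, one call. All n draws come from a single stream.
set.seed(1)
x_single <- runif(n)

# Scheme 2: one seed per draw. Each value is the first output after re-seeding
# with sequential seeds 1, 2, ..., n.
x_reseeded <- vapply(seq_len(n), function(s) {
  set.seed(s)
  runif(1)
}, numeric(1))

# One of many possible checks: lag-1 correlation within each sequence.
# Passing a single check like this is not evidence of independence in general.
lag1_single   <- cor(head(x_single, -1),   tail(x_single, -1))
lag1_reseeded <- cor(head(x_reseeded, -1), tail(x_reseeded, -1))
```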
This is important, because many uses of n simulated values implicitly assume they will be independent. For example, if you want a confidence interval for the mean of the simulated value, the usual CI calculation assumes independence.
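For reference, this is the usual t-based interval I have in mind, written out so the assumption is visible. The data here are just runif draws as a stand-in for simulation output.

```r
set.seed(1)
x <- runif(100)  # stand-in for n simulated values
n <- length(x)

# sd(x) / sqrt(n) is the standard error of the mean only if the values are
# independent (or at least uncorrelated); with dependence it can be too small
# or too large.
se <- sd(x) / sqrt(n)
ci <- mean(x) + c(-1, 1) * qt(0.975, df = n - 1) * se
ci
```

This is the same interval that t.test(x)$conf.int reports at the default 95% level.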
I would guess that most simulations will be fine, but I'd also guess that some won't be, and I doubt if you will have any way to know if yours is okay or not. So I wouldn't do that.
If you really want reproducibility of each individual simulation n, an easy but slow approach is to set the seed at the start, then run the full simulation n-1 times, ignoring the results, followed by the one that interests you. You can speed this up by saving the random seed state periodically: for example, every 100 simulations, store .Random.seed in a file. This may use a lot of file space, because the state takes a bit more than 2500 bytes, but it saves time because you throw away far fewer simulations to get to the one you want.
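Here is a rough sketch of that checkpointing idea. Everything named here is my own invention for illustration: run_simulation() is a trivial stand-in for your simulation, replay() is a hypothetical helper, and the file naming and temporary directory are arbitrary. It assumes the simulation draws all its randomness from R's built-in generator with the default settings.

```r
# Hypothetical stand-in for the real simulation.
run_simulation <- function() mean(rnorm(10))

n_sims <- 250
every  <- 100  # save the generator state before simulations 1, 101, 201, ...
checkpoint_dir <- tempfile("rng_checkpoints")
dir.create(checkpoint_dir)

set.seed(1)
results <- numeric(n_sims)
for (i in seq_len(n_sims)) {
  if ((i - 1) %% every == 0) {
    # Store the state as it is just before simulation i runs.
    saveRDS(get(".Random.seed", envir = globalenv()),
            file.path(checkpoint_dir, sprintf("state_%05d.rds", i)))
  }
  results[i] <- run_simulation()
}

# Reproduce simulation `target` by restoring the nearest earlier checkpoint
# and discarding the runs in between.
replay <- function(target) {
  start <- ((target - 1) %/% every) * every + 1
  state <- readRDS(file.path(checkpoint_dir, sprintf("state_%05d.rds", start)))
  assign(".Random.seed", state, envir = globalenv())
  out <- NA_real_
  for (i in start:target) out <- run_simulation()
  out
}
```

With this layout, replay(137) restores the state saved before simulation 101 and runs 37 simulations instead of 137. Saving more often trades disk space for less replay time.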