I am trying to simulate two exponential distributions. For example, two CPUs process jobs independently: one has an average service time of 10 min (lambda = 0.1) and the other of 20 min (lambda = 0.05). Both of them are busy when a new job arrives.
I would like to simulate the waiting time of a new job.
Here is what I did so far.
cpu1 = rexp(n = 10000, rate = .1)
cpu2 = rexp(n = 10000, rate = .05)
I generate 10K data points from each exponential distribution. For each pair, the new job has to wait min(cpu1[i], cpu2[i]).
I store all of them in a data frame and compute the mean.
for (i in seq(1, 10000)) {
  if (i == 1) {
    # first iteration: create the data frame
    df1 <- data.frame(waiting_time = min(cpu1[i], cpu2[i]))
  } else {
    # later iterations: append one row at a time
    df1 <- rbind(df1, data.frame(waiting_time = min(cpu1[i], cpu2[i])))
  }
}
mean(df1$waiting_time)
Is this the right way to do the simulation, or am I doing something wrong?
As has been pointed out, mean(pmin(cpu1, cpu2)) is equivalent to the for loop followed by mean(df1$waiting_time), but much, much faster.
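As a quick sanity check (a sketch; the seed is an assumption added here for reproducibility, not part of the question):

set.seed(42)  # assumed seed, purely for reproducibility
cpu1 <- rexp(n = 10000, rate = 0.1)
cpu2 <- rexp(n = 10000, rate = 0.05)
mean(pmin(cpu1, cpu2))  # pmin takes the element-wise minimum in a single vectorized pass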
Or you could skip the simulation altogether, since the minimum of two independent exponential random variables is also exponentially distributed, with a rate equal to the sum of the two rates. Furthermore, the sum of n iid exponential random variables is gamma-distributed with the same rate parameter and a shape parameter equal to n.
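As a rough empirical check of the first fact (a sketch; the sample size of 1e5, the seed, and the use of ks.test are assumptions made here):

set.seed(1)  # assumed seed
m <- pmin(rexp(1e5, rate = 0.1), rexp(1e5, rate = 0.05))  # minimum of the two exponentials
mean(m)                          # should be close to 1/0.15, about 6.67
ks.test(m, "pexp", rate = 0.15)  # consistent with an Exp(0.15) distribution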
So we can simply do rgamma(1, 1e4, 0.15)/1e4 or, equivalently, rgamma(1, 1e4, 0.15*1e4) instead of mean(pmin(cpu1, cpu2)), and the results will have identical distributions.
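To see this empirically, one could replicate both estimators and compare their means and spreads (a sketch; the 1,000 replications and the seed are arbitrary assumptions):

set.seed(2)  # assumed seed
sim <- replicate(1000, mean(pmin(rexp(1e4, 0.1), rexp(1e4, 0.05))))  # simulation-based estimate
gam <- rgamma(1000, shape = 1e4, rate = 0.15 * 1e4)                  # gamma shortcut
c(mean(sim), mean(gam))  # both close to 1/0.15 (about 6.67)
c(sd(sim), sd(gam))      # both close to (1/0.15)/sqrt(1e4) (about 0.067)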