Sampling from a given probability distribution using R


Given the following probability distribution (shown as a bar chart in the original post; the image is not reproduced here):

The x-axis represents hours, and the y-axis gives the probability for each hour.

How can I generate a set of 1000 random values that follow this probability distribution?


Solution

  • The key function is sample. Its optional prob argument specifies the probability weight for each element being sampled. For example,

    # Draw 1000 values from 1..22, weighted by the bar heights
    sample(1:22, 1000, replace = TRUE, prob = c(
      0, 1, 0, 3, 7, 14, 30, 24, 5, 3, 3, 2, 4, 3, 1, 2, 3, 2, 2, 2, 1, 0
    ))
    

    (Replace that vector of numbers with the heights of your bars.) The prob argument doesn't have to sum to one; R will renormalise it for you.

    Depending on your version of R, you may see a warning that it is using "Walker's alias method" and that the results are not comparable to those from older versions of R. This is normal and nothing to worry about.
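
As a rough sanity check on the approach above, the sketch below draws the sample and compares the observed relative frequencies with the normalised weights. The variable names (weights, hours, draws, comparison) and the fixed seed are assumptions made for illustration. They are not part of the original answer, and the bar heights are the placeholder values from the example, not the real data.

```r
# Placeholder bar heights from the example above, used as
# unnormalised weights for the values 1..22.
weights <- c(0, 1, 0, 3, 7, 14, 30, 24, 5, 3, 3, 2, 4, 3, 1, 2, 3, 2, 2, 2, 1, 0)
hours <- seq_along(weights)

set.seed(42)  # assumption: fixed seed so the run is reproducible
draws <- sample(hours, 1000, replace = TRUE, prob = weights)

# Observed relative frequencies; factor() keeps values that were never drawn.
empirical <- as.numeric(table(factor(draws, levels = hours))) / length(draws)

# Target probabilities: the weights rescaled to sum to one.
target <- weights / sum(weights)

# Side-by-side comparison of target and observed probabilities.
comparison <- data.frame(
  hour = hours,
  target = round(target, 3),
  empirical = round(empirical, 3)
)
print(comparison)
```

With 1000 draws the two columns should agree only approximately. Values with a weight of zero should never appear in the sample.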