
Generate multinomial random variables with varying sample size in R


I need to generate multinomial random variables with varying sample sizes.

Let's say I have already generated my sample sizes as follows:

samplesize =c(50,45,40,48)

Then I need to generate multinomial random variables based on these varying sample sizes. I tried this using a for loop and using an apply function (sapply).

Using a for loop:

p1=c(0.4,0.3,0.3)
for( i in 1:4)
{
xx1[i]=rmultinom(4, samplesize[i], p1)
} 

If my code is correct, I should get a matrix with 4 columns and 3 rows, where each column total equals the corresponding value in samplesize. But I am not getting that.

Using sapply:

sapply( samplesize ,function(x)
{
  rmultinom(10, samplesize[x], p1)
})

I am getting an error here as well.

Can anyone help me figure out what went wrong?

Thank you


Solution

    samplesize <- c(50, 45, 40, 48)
    p <- c(0.4, 0.3, 0.3)
    
    ## method 1
    set.seed(0)
    xx1 <- matrix(0, length(p), length(samplesize))
    for (i in seq_along(samplesize)) {
      xx1[, i] <- rmultinom(1, samplesize[i], p)
    }
    xx1
    #     [,1] [,2] [,3] [,4]
    #[1,]   24   17   20   24
    #[2,]   11   14    8   16
    #[3,]   15   14   12    8
    colSums(xx1)
    #[1] 50 45 40 48
    
    ## method 2
    set.seed(0)
    xx2 <- sapply(samplesize, rmultinom, n = 1, prob = p)
    xx2
    #     [,1] [,2] [,3] [,4]
    #[1,]   24   17   20   24
    #[2,]   11   14    8   16
    #[3,]   15   14   12    8
    colSums(xx2)
    #[1] 50 45 40 48
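
As a rough sketch of what likely went wrong in the question's sapply attempt: sapply passes the values of samplesize (50, 45, ...) to the function as x, not the positions 1 to 4, so samplesize[x] indexes past the end of the vector. The for loop has a related problem, since it assigns a whole matrix to the single element xx1[i] of an object that was never created. The name `ok` below is only illustrative.

```r
samplesize <- c(50, 45, 40, 48)
p1 <- c(0.4, 0.3, 0.3)

# sapply() hands the values 50, 45, 40, 48 to the function as x,
# so samplesize[x] looks up positions that do not exist and yields NA.
samplesize[samplesize]
# [1] NA NA NA NA

# rmultinom() rejects an NA size, which would explain the reported error.
res <- tryCatch(rmultinom(10, samplesize[50], p1), error = function(e) e)
inherits(res, "error")
# [1] TRUE

# Using x itself as the size, with n = 1 draw per sample size, works.
ok <- sapply(samplesize, function(x) rmultinom(1, x, p1))
dim(ok)
# [1] 3 4
```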
    

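    Note: unlike other distribution functions such as rnorm, rmultinom is not vectorized over its size argument. As the output below shows, passing the whole samplesize vector makes every column sum to 50, the first element of samplesize.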

    set.seed(0)
    fail <- rmultinom(length(samplesize), samplesize, p)
    fail
    #     [,1] [,2] [,3] [,4]
    #[1,]   24   19   25   24
    #[2,]   11   16   10   17
    #[3,]   15   15   15    9
    colSums(fail)
    #[1] 50 50 50 50
    

    So an R-level loop is necessary: a for loop, an sapply call, or a wrapper built with the sugar function Vectorize.
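
The Vectorize option mentioned above could look roughly like the following. This is a minimal sketch: the names `rmultinom_vec` and `xx3` are made up for illustration, and vectorize.args is restricted to "size" on the assumption that prob should be passed whole to every call rather than split element by element.

```r
samplesize <- c(50, 45, 40, 48)
p <- c(0.4, 0.3, 0.3)

# Vectorize() builds a wrapper around mapply(); only `size` is looped over,
# while `n` and `prob` are passed unchanged to each rmultinom() call.
rmultinom_vec <- Vectorize(rmultinom, vectorize.args = "size")

set.seed(0)
xx3 <- rmultinom_vec(n = 1, size = samplesize, prob = p)

dim(xx3)
# [1] 3 4
colSums(xx3)
# [1] 50 45 40 48
```

Internally this is still an R-level loop, so it is a matter of convenience rather than speed.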