r simulation normal-distribution

How to sample binary data to get a normal distribution of the sum of the rows


I want to create 4 binary variables, each with 300 observations (I may later want to increase this from 4 to 10 variables). When I sum across each row, I want the sum column to follow a normal distribution. Can this be done in R? Here is a small random sample to illustrate:

   m1  m2  m3  m4  sum
    1   1   0   1   3
    1   1   0   1   3
    1   0   0   0   1
    0   1   0   0   1
    0   0   1   0   1
    0   1   1   0   2
    1   0   1   1   3
    0   0   1   1   2
    0   0   1   0   1
    1   0   0   1   2
    1   0   0   0   1
    1   0   0   0   1
    1   0   1   1   3

Solution

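  • This might be what you were asking for. Each column is an independent 0/1 draw with equal probability, so the row sum follows a Binomial(4, 0.5) distribution, which is roughly bell-shaped and gets closer to a normal distribution as you add more variables: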

    set.seed(1)  # optional, only so the example is reproducible
    n <- 300     # number of rows
    k <- 4       # number of binary variables (e.g. raise to 10)

    # Each column is n independent draws from {0, 1}
    data <- as.data.frame(replicate(k, sample(0:1, n, replace = TRUE)))
    names(data) <- paste0("m", seq_len(k))
    data$sum <- rowSums(data)  # row-wise sum, between 0 and k
    

    # Smoothed density of the row sums
    plot(density(data$sum, bw = 2))

    EDIT: to see the actual (discrete) distribution of the sums:

    plot(table(data$sum))
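The same idea can be wrapped in a function so the number of variables is easy to change from 4 to 10. This is only a sketch: `simulate_binary` and its arguments `n`, `k` and `p` are hypothetical names, not part of the original answer, and the dashed curve is the usual normal approximation to a Binomial(k, p) sum (mean `k * p`, standard deviation `sqrt(k * p * (1 - p))`), shown for comparison rather than as an exact fit.

```r
# Hypothetical helper (not from the original answer): simulate n rows of
# k independent 0/1 variables, each equal to 1 with probability p, and
# append the row sum as a final column.
simulate_binary <- function(n = 300, k = 4, p = 0.5) {
  m <- matrix(rbinom(n * k, size = 1, prob = p), nrow = n, ncol = k)
  colnames(m) <- paste0("m", seq_len(k))
  out <- as.data.frame(m)
  out$sum <- rowSums(m)
  out
}

set.seed(1)  # only for reproducibility
k <- 10
p <- 0.5
data10 <- simulate_binary(n = 300, k = k, p = p)

# Parameters of the normal approximation to Binomial(k, p)
mu <- k * p
sigma <- sqrt(k * p * (1 - p))

# Observed proportion of each possible sum (0..k), drawn as spikes,
# with the approximating normal density overlaid as a dashed curve
freq <- table(factor(data10$sum, levels = 0:k)) / nrow(data10)
plot(0:k, as.numeric(freq), type = "h",
     xlab = "row sum", ylab = "proportion")
curve(dnorm(x, mean = mu, sd = sigma), from = 0, to = k,
      add = TRUE, lty = 2)
```

With only 4 variables the sum can take just five values (0 to 4), so it looks quite coarse. With 10 variables the spikes should follow the dashed curve more closely, although the sum is still discrete and bounded, so it is never exactly normal.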