I want to create 4 binary variables with 300 observations each (I may later want to increase this from 4 to 10 variables). When I sum across rows, I want the sum column to follow an approximately normal distribution. Can this be done in R? Here is a random sample to demonstrate:
m1 m2 m3 m4 sum
1 1 0 1 3
1 1 0 1 3
1 0 0 0 1
0 1 0 0 1
0 0 1 0 1
0 1 1 0 2
1 0 1 1 3
0 0 1 1 2
0 0 1 0 1
1 0 0 1 2
1 0 0 0 1
1 0 0 0 1
1 0 1 1 3
This might be what you were asking for:
# 300 observations of 4 independent binary (0/1) variables
n <- 300
data <- data.frame(
  m1 = sample(0:1, n, replace = TRUE),
  m2 = sample(0:1, n, replace = TRUE),
  m3 = sample(0:1, n, replace = TRUE),
  m4 = sample(0:1, n, replace = TRUE)
)
# Row-wise sum of the four variables
data$sum <- rowSums(data[, 1:4])
# Smoothed density of the sums (bw = 2 is a wide bandwidth for values in 0..4)
plot(density(data$sum, bw = 2))
EDIT: To plot the distribution of the sums as counts per value:
plot(table(data$sum))
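Since the question mentions going from 4 to 10 variables, here is a minimal sketch of the same idea for any number of columns. The helper name make_binary_data and its arguments n, k and p are hypothetical, not part of the original answer, and rbinom is swapped in for sample(0:1, ...). Note that the sum of k independent 0/1 variables with success probability p is Binomial(k, p), so it is discrete and only approximately normal, with mean k*p and variance k*p*(1-p); the approximation is rough for k = 4 and better for k = 10.

```r
# Hypothetical helper (not from the original answer): n rows of k independent
# 0/1 variables, each equal to 1 with probability p, plus their row-wise sum.
make_binary_data <- function(n = 300, k = 4, p = 0.5) {
  m <- matrix(rbinom(n * k, size = 1, prob = p), nrow = n, ncol = k)
  colnames(m) <- paste0("m", seq_len(k))
  data <- as.data.frame(m)
  data$sum <- rowSums(m)
  data
}

set.seed(1)  # arbitrary seed, only so the example is reproducible
data10 <- make_binary_data(n = 300, k = 10)

# Counts per value of the sum; roughly bell-shaped around k * p = 5
plot(table(data10$sum))
```

With k = 4 the sum can only take the values 0 to 4, so the bar plot will look coarse; increasing k gives more distinct values and a shape closer to a normal curve.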