
How to generate a random vector with fixed length and fixed proportion of two values?


I want to generate a random vector that contains only two possible values: "FEMALE" and "MALE". The vector should have a fixed length and an EXACT, fixed percentage of each value.

I tried the code below. It works, except that it doesn't give me the exact percentages:

> x1 <- sample(c("FEMALE", "MALE"), size = 19749, replace = TRUE, prob=c(0.538, 0.462))

> length(x1)
[1] 19749

> x2 <- table(x1)

> prop.table(x2)
x1
   FEMALE      MALE 
0.5410401 0.4589599

Does anyone know why I didn't get the exact percentages of FEMALE and MALE in vector x1? And how can I fix the code to get them?


Solution

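  • sample() with prob draws each element independently, so the observed proportions vary randomly around the targets. To fix the proportions, first create a vector with the required number of each value, then shuffle it with sample(). Round one count and take the remainder for the other so that the length is exactly n. Since 0.538 * 19749 is not a whole number, the proportions can only be exact to within one element.

    n <- 19749
    n_female <- round(0.538 * n)   # 10625
    n_male   <- n - n_female       # 9124, so the counts sum to n
    x1 <- sample(c(rep("FEMALE", n_female),
                   rep("MALE", n_male)))
    length(x1)
    # [1] 19749
    prop.table(table(x1))
    # x1
    #    FEMALE      MALE 
    # 0.5380019 0.4619981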
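
The same idea can be wrapped in a small helper for any number of values. This is only a sketch: exact_sample is a hypothetical name, not a base R function, and it assumes the proportions sum to 1 and n is a whole number. It floors each target count and then hands the leftover elements to the values with the largest fractional parts, so the result always has length n.

```r
# Hypothetical helper (not part of base R): build a shuffled vector of
# length n whose value counts match the target proportions as closely
# as whole numbers allow.
exact_sample <- function(values, props, n) {
  stopifnot(length(values) == length(props),
            abs(sum(props) - 1) < 1e-8)
  raw    <- props * n
  counts <- floor(raw)
  # Give the leftover elements to the values with the largest fractional parts
  leftover <- n - sum(counts)
  if (leftover > 0) {
    idx <- order(raw - counts, decreasing = TRUE)[seq_len(leftover)]
    counts[idx] <- counts[idx] + 1
  }
  # rep() repeats each value by its count; sample() without size shuffles
  sample(rep(values, times = counts))
}

x <- exact_sample(c("FEMALE", "MALE"), c(0.538, 0.462), 19749)
length(x)   # 19749
table(x)    # FEMALE 10625, MALE 9124
```

Call set.seed() beforehand if the shuffled order needs to be reproducible; the counts themselves do not depend on the seed.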