
Randomly assigning 1s and 0s with a fixed total


I have a dataset with two different columns (X and Y) that both contain exactly the same number of 0s and 1s:

0     1 
3790  654

Now I want column Y to contain exactly 1733 1s and 2711 0s, and the 1079 extra 1s (1733 - 654) must be assigned randomly. I already tried the following:

ind <- which(df$X == 0)
ind <- ind[rbinom(length(ind), 1, prob = 1079/3790) > 0]
df$Y[ind] <- 1

But every time I run this code, I get a different number of 1s, and I want exactly 1733 each time. How do I do this?


Solution

  • You have this vector:

    x <- sample(c(rep(0, 3790), rep(1, 654)))
    
    #> table(x)
    #> x
    #>    0    1 
    #> 3790  654 
    

    What you need to do is randomly select the positions of 1079 elements in your vector that equal 0 and assign them the value 1:

    s <- sample(which(x == 0), 1079)
    x[s] <- 1
    
    #> table(x)
    #> x
    #>    0    1 
    #> 2711 1733
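
The same idea can be applied directly to the data frame from the question. The following is only a sketch under assumptions: the data frame `df` is rebuilt here as a stand-in for the real data, the seed is arbitrary, and the names `target_ones`, `n_extra`, `zero_rows` and `flip` are made up for illustration. The `rbinom()` attempt flips each 0 independently, so the number of flips is itself random; drawing a fixed number of positions without replacement gives the exact count on every run.

```r
# Stand-in for the real data (assumption): X and Y hold the same 0/1 values.
set.seed(42)  # arbitrary seed, only so the example is reproducible
x  <- sample(c(rep(0, 3790), rep(1, 654)))
df <- data.frame(X = x, Y = x)

target_ones <- 1733
n_extra     <- target_ones - sum(df$Y == 1)  # 1733 - 654 = 1079

# Row indices that currently hold a 0 in Y.
zero_rows <- which(df$Y == 0)

# Draw exactly n_extra of them without replacement. Indexing through
# sample.int() avoids the case where sample() receives a single number
# and samples from 1:n instead of from the vector itself.
flip <- zero_rows[sample.int(length(zero_rows), n_extra)]
df$Y[flip] <- 1

table(df$Y)
```

Rows that were already 1 are left untouched, so only the 1079 newly assigned 1s are random. Removing `set.seed()` gives a different random selection on each run, but the counts stay at 2711 zeros and 1733 ones.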