Tags: r, list, vector, dplyr, sample

Draw from a bag of colored marbles; for each draw remove all marbles of that color


I want to draw from a bag of colored marbles without replacement, following some simple rules. There are multiple marbles of the same color (e.g., 5 blue, 3 red, 7 yellow, 4 green). Let's say I draw 3 marbles, one marble at a time. After each draw, I remove all marbles of that color. E.g., if I pick a green, I remove all green marbles from the bag; if I pick a red, I remove all red marbles from the bag, etc.

I'm not totally clear on how to most efficiently remove all the marbles of the same color as the focal draw, without tons of for-loops. The dummy code below only draws marbles according to a vector of draw counts; it does not remove the matching colors.

#Dummy code
set.seed(123)
multiple_draws <- c(3,2,4,1)
bag <- c(rep("blue",5),rep("red",3),rep("yellow",7),rep("green",4))

sapply(seq_along(multiple_draws), function(i) sample(bag, multiple_draws[i], replace = FALSE), simplify = FALSE)

Any pointers would be much appreciated.


Solution

  • Your data

    set.seed(123)
    multiple_draws <- c(3,2,4,1)
    bag <- c(rep("blue",5),rep("red",3),rep("yellow",7),rep("green",4))
    

    Convert your vector to a table of proportions using prop.table(table(...)):

    prop.table(table(bag))
    
    # bag
    #      blue     green       red    yellow 
    # 0.2631579 0.2105263 0.1578947 0.3684211
    

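    You can then sample the unique colors in your bag, using those proportions as the sampling weights. With replace=FALSE, sample() applies the weights sequentially: after each pick, the next one is drawn with probability proportional to the weights of the colors still remaining. That is equivalent to drawing a single marble and then removing every marble of its color.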

    custom_sample <- function(vec, n) {
        # Weight of each color = its share of the marbles in the bag
        props <- prop.table(table(vec))
        sample(names(props), n, replace = FALSE, prob = props)
    }
    lapply(multiple_draws, function(n) custom_sample(bag, n))
    # [[1]]
    # [1] "yellow" "red"    "blue"  
    
    # [[2]]
    # [1] "red"   "green"
    
    # [[3]]
    # [1] "yellow" "green"  "red"    "blue"  
    
    # [[4]]
    # [1] "blue"
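
As a cross-check on the weighted-sampling shortcut, the procedure can also be simulated literally, one marble at a time. The sketch below is not part of the original answer: `draw_and_remove` is a hypothetical helper name, and capping `n` at the number of distinct colors is an assumption about what should happen when more draws are requested than there are colors. It should produce draws with the same distribution as `custom_sample`, though not the same values for a given seed.

```r
# Hypothetical helper: draw one marble at a time and, after each draw,
# remove every marble of the drawn color from the bag.
draw_and_remove <- function(bag, n) {
  # Assumption: never ask for more draws than there are distinct colors
  n <- min(n, length(unique(bag)))
  drawn <- character(n)
  for (i in seq_len(n)) {
    # Pick one marble uniformly at random by position
    drawn[i] <- bag[sample.int(length(bag), 1)]
    # Drop all marbles sharing that color
    bag <- bag[bag != drawn[i]]
  }
  drawn
}

set.seed(123)
multiple_draws <- c(3, 2, 4, 1)
bag <- c(rep("blue", 5), rep("red", 3), rep("yellow", 7), rep("green", 4))

lapply(multiple_draws, function(n) draw_and_remove(bag, n))
```

The loop runs at most once per distinct color, so it stays cheap even for a large bag; the vectorized `sample(..., prob = ...)` call remains the more compact choice.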