
R iteratively select and remove values from a vector


I want to iteratively select a number of values from a vector (based on a coin-toss probability) and remove them from that vector. In the next iteration I want to select again (by coin toss) from the values that remain, and so on until the vector is empty. Below is the solution I have in mind, but at the end I get stuck with one non-selected value in the vector:

vector <- c("item1", "item2", "item3", "item4", "item5", "item6", "item7", "item8", "item9", "item10")
for (i in 1:10) {
  # select values from the vector based on coin-toss probability, so that roughly half of the items get selected
  selected <- sample(vector, replace = FALSE, size = length(vector)/2)
  print(selected)
  # do some operation with the selected values

  # remove the selected values from the original vector
  vector <- vector[!vector %in% selected]
  print(vector)
  # as we are in a loop, this keeps happening until all of the elements in the vector have been selected
}

NOTE: I don't want to select any value twice!

Can anybody suggest a better solution for this?

EDIT: Can there be a coin-toss based selection where I don't give the size explicitly? For example, for every value in the vector we calculate a probability of selection, and the value gets selected if that probability is higher than 0.5 and is left in the vector otherwise.

I want to do it this way because I want to iterate over this vector 1000 times, and I expect to get different results in every iteration because the selection differs each time.


Solution

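  • Here's a different solution. The most important change is the use of ceiling() in defining the sample size: length(x)/2 is fractional for odd lengths and sample() rounds a fractional size down, so with one element left the size becomes 0 and that element is never selected.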

    x <- c("item1", "item2", "item3", "item4", "item5", "item6", "item7",
           "item8", "item9", "item10")
    
    while(length(x) > 0) {
      selected <- sample(x, replace = FALSE, size = ceiling(length(x)/2))
      cat("selected:", selected, "\n")
      x <- x[!x %in% selected]
      cat("remaining:", x, "\n\n")
    }
    
    selected: item5 item3 item8 item10 item4 
    remaining: item1 item2 item6 item7 item9 
    
    selected: item1 item2 item9 
    remaining: item6 item7 
    
    selected: item6 
    remaining: item7 
    
    selected: item7 
    remaining: 
    

    I also used a while loop instead of OP's for loop since that makes more sense conceptually.
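
    To see why rounding the sample size up matters, here is a minimal sketch that tracks only the number of remaining elements, assuming that sample() effectively rounds a fractional size down (which matches the behaviour described in the question). The helper name remaining_after_rounds and its max_rounds argument are made up for illustration.

```r
# Hypothetical helper: count how many elements remain after each round
# when half of them are removed, using a given rounding function for
# the sample size.
remaining_after_rounds <- function(n, round_fun, max_rounds = 10) {
  sizes <- n
  for (i in seq_len(max_rounds)) {
    if (n == 0) break            # nothing left, stop early
    n <- n - round_fun(n / 2)    # remove the "selected" half
    sizes <- c(sizes, n)
  }
  sizes
}

# Rounding down: the count stalls at 1, because floor(1 / 2) is 0
print(remaining_after_rounds(10, floor))

# Rounding up: the count reaches 0, because ceiling(1 / 2) is 1
print(remaining_after_rounds(10, ceiling))
```

    With floor the sequence gets stuck at one remaining element for every further round, while with ceiling it goes 10, 5, 2, 1, 0, which is the same pattern as in the output above.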


    Regarding OP's comments:

    You can also try something like the following, where you don't define the size of the sample that is being selected. Note, however, that some rounds may then select no elements, or all of them, even though the probability for each element is 0.5:

    x <- c("item1", "item2", "item3", "item4", "item5", "item6", "item7", 
           "item8", "item9", "item10")
    while(length(x) > 0) {
      selected <- x[sample(c(TRUE, FALSE), size = length(x), replace = TRUE)]
      cat("selected:", selected, "\n")
      x <- x[!x %in% selected]
      cat("remaining:", x, "\n\n")
    }
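
    The EDIT in the question describes drawing a probability for each value and selecting it when that probability is above 0.5. A minimal sketch of that variant follows, wrapped in a function so it can be repeated many times. The function name coin_toss_rounds and its threshold argument are assumptions for illustration, and the sketch assumes threshold is below 1 (otherwise nothing is ever selected and the loop never ends).

```r
# Hypothetical helper: in each round, draw one uniform number per
# remaining element and select the elements whose draw exceeds the
# threshold. Returns a list with the selection made in each round.
coin_toss_rounds <- function(x, threshold = 0.5) {
  rounds <- list()
  while (length(x) > 0) {
    keep <- runif(length(x)) > threshold      # one independent toss per element
    rounds[[length(rounds) + 1]] <- x[keep]   # may be empty in some rounds
    x <- x[!keep]                             # drop by position, not by value
  }
  rounds
}

set.seed(42)  # only to make the sketch reproducible
items <- paste0("item", 1:10)
rounds <- coin_toss_rounds(items)

# Every item is selected exactly once across all rounds
stopifnot(setequal(unlist(rounds), items),
          length(unlist(rounds)) == length(items))

# Repeat the whole procedure 1000 times, as mentioned in the question
results <- replicate(1000, coin_toss_rounds(items), simplify = FALSE)
print(table(lengths(results)))  # how many rounds each repetition needed
```

    Removing elements by position (x[!keep]) rather than with %in% also keeps the procedure correct if the vector contains duplicate values.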