Search code examples
rprobabilitymontecarlo

Monte Carlo R function help for finding probability (balls from urn problem)


I'm trying to answer the following question using a simple Monte Carlo sampling procedure in R: An urn contains 10 balls. Two red, three white, and five black. All 10 are drawn one at a time without replacement. Find the probability that both the first and last balls drawn are black.

I've tried two approaches and neither work.

Here's the longer approach that's more intuitive to me:

balls <- c(1:10) #Consider 1-5 black, 6-8 white, and 9-10 red.

pick.ball <- function(balls){
sample(x = balls, 1, replace = FALSE)
}

experiment <- function(n){
picks = NULL
keep <- NULL
for(j in 1:n){
   for(i in 1:10){
   picks[i] <- pick.ball(balls = balls)
   }
keep[j] <- ifelse(picks[1] == any(1:5) & picks[10] == any(1:5), 1, 0)
 }
return(length(which(keep == 1))/n)
}

Here's my second, simpler approach which shows my lack of understanding of the repeat loop. Don't bother running it--it just goes on forever. But if someone could help me better understand why, that would be appreciated!

balls <- c(1:10) #Consider 1-5 black, 6-8 white, and 9-10 red.

pick.ball <- function(balls, n){
  keep = NULL
  for(i in 1:n){
  picks <- sample(x = balls, 10, replace = FALSE)
  keep[i] <- ifelse(picks[1] == any(1:5) & picks[10] == any(1:5), 1, 0)
  repeat{
    picks
    if(length(keep) == n){
      break
      }
    }
  }
  return(which(keep == 1)/n)
}

Solution

  • Here is a loop that I created. You can wrap it up in a function if you want. Instead of numbering the balls, I am using letters.

    urn <- c(rep("B", 5), rep("W", 3), rep("R", 2))
    
    # Set the number of times you want to run the loop
    
    nloops <- 10000
    
    
    # Create an empty data frame to hold the outcomes of the simulation
    
    m <- structure(list(first = character(),
                        last = character(),
                        is_black = integer()),
                   class = "data.frame")
    
    

    Now run the loop

    
    set.seed(456)
    for (j in 1:nloops) {
      b <- sample(urn, 10, replace = FALSE)
      m[j, 1:2 ] <- b[c(1, 10)] 
      m[j, 3] <- ifelse(m[j, 1] == "B" & m[j, 2] == "B", 1, 0)
    }
    
    head(m)
    
      first last is_black
    1     B    W        0
    2     B    B        1
    3     B    B        1
    4     R    B        0
    5     B    R        0
    6     R    W        0
    

    Finally, the answer:

    # Proportion of cases where first and last ball drawn were black
    
    sum(m[ , 3]) / nloops
    
    # This was 0.22