Tags: r, simulation, probability, data-science

Monte Carlo simulation in R for Monty Hall problem not working?


I'm writing a function in R to run a Monte Carlo simulation of the Monty Hall problem. The function works when the doors are not switched, i.e. with switch == FALSE, but when I call mean(replicate(10000, monty_hall(switch = TRUE))), I get around 0.25 instead of the expected answer of about 0.66.

Here is the code for the function:

monty_hall = function(switch = logical()){
    doors <- c(1,2,3)
    names(doors) <- rep(c("goat", "car"), c(2,1))
    prize_door <- doors[3]

    guess <- sample(doors, 1)
    revealed_door <- sample(doors[!doors %in% c(guess, prize_door)],1)
    if(switch){
        switched_door <- sample(doors[!doors %in% c(guess, revealed_door)],1)
        prize_door == switched_door
    } else {
        prize_door == guess
    }
}

What changes should I make to get the correct output, which is around 0.66?


Solution

  • Just change the doors vector to characters

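    monty_hall = function(switch = logical()){
       # Character labels, so sample() never expands a lone door into 1:x
       doors <- c("1","2","3")
       names(doors) <- rep(c("goat", "car"), c(2,1))
       prize_door <- doors[3]
    
       guess <- sample(doors, 1)
       # The host opens a door that is neither the guess nor the prize
       revealed_door <- sample(doors[!doors %in% c(guess, prize_door)],1)
       if(switch){
          # Switch to the one door that is neither guessed nor revealed
          switched_door <- sample(doors[!doors %in% c(guess, revealed_door)],1)
          prize_door == switched_door
       } else {
          prize_door == guess
       }
    }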
    

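    Suppose the contestant chooses door number 1 and the prize is behind door number 2, so the only door left to reveal is door number 3.

    The call is then effectively revealed_door <- sample(3,1), which does not work the way you expect: because the argument is a single number, it behaves like revealed_door <- sample(c(1,2,3),1) and can return any of the three doors. The same thing happens in the switch step, where only one door is ever left to choose from.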

    From the function's documentation (type ?sample to read it):

    If x has length 1, is numeric (in the sense of is.numeric) and x >= 1, sampling via sample takes place from 1:x. Note that this convenience feature may lead to undesired behaviour when x is of varying length in calls such as sample(x)
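The behaviour described in that passage can be checked directly. The following is only a small sketch; the variable names draws_numeric and draws_character are made up for illustration and are not part of the original code.

```r
# A length-1 numeric argument is treated as "draw from 1:x"
set.seed(42)
draws_numeric <- replicate(1000, sample(3, 1))
print(table(draws_numeric))        # counts are spread over 1, 2 and 3

# A length-1 character argument has no such shortcut:
# the single element itself is always returned
draws_character <- replicate(1000, sample("3", 1))
print(unique(draws_character))
```

This is why the character version of doors gives the expected result: a lone remaining door is returned as-is instead of being expanded into a range.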

    I think the easiest fix is to switch to characters, but if you must use numeric values, check the vector's length: return the value directly if the length is 1, and call sample otherwise.
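A minimal sketch of that length check, keeping the doors numeric. The helper name safe_sample and the function name monty_hall_numeric are hypothetical, and the default switch = TRUE is an assumption made here for convenience; the rest follows the structure of the function in the question.

```r
# Hypothetical helper: draw one element from x, returning x itself
# when only one element is left (avoids the sample(1:x) shortcut)
safe_sample <- function(x) {
  if (length(x) == 1) x else sample(x, 1)
}

monty_hall_numeric <- function(switch = TRUE) {
  doors <- c(1, 2, 3)
  prize_door <- 3

  guess <- sample(doors, 1)   # doors always has length 3 here, so this is safe
  # The host opens a door that is neither the guess nor the prize
  revealed_door <- safe_sample(doors[!doors %in% c(guess, prize_door)])
  if (switch) {
    # Exactly one door remains once the guess and the revealed door are excluded
    switched_door <- safe_sample(doors[!doors %in% c(guess, revealed_door)])
    prize_door == switched_door
  } else {
    prize_door == guess
  }
}

set.seed(1)
win_rate_switch <- mean(replicate(10000, monty_hall_numeric(switch = TRUE)))
win_rate_stay   <- mean(replicate(10000, monty_hall_numeric(switch = FALSE)))
print(c(switch = win_rate_switch, stay = win_rate_stay))
```

With this check in place the switching strategy should win about two thirds of the time and staying about one third. An equivalent alternative is to index by position, x[sample.int(length(x), 1)], which also never expands a single number into a range.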