Search code examples
pythonnumpymontecarlo

How to revise Monte Carlo Simulation code for Coupon Collector problem


The question I'm solving is if I receive 1/5 coupons at random a day, how many days will it take to collect 2 copies of each of the 5 coupons?

My code is only checking for 1 of each coupon instead of two but I'm not sure where I can make revisions. I edited the while True statement to include a list to my cards_own statement like this cards_own != [1,1,2,2,3,3,4,4,5,5] but then my output didn't print at all. I think maybe using the random.choice function would be a better option. Any recommendations?

import numpy as np

# number of simulations
n =10000 

# defining function to check if we got all 5 cards indexed from 0 to 4
def all_cards(arr2,arr1=np.array(range(5))):
  return np.array_equal(arr1,arr2)  
days_taken = 0

# performing simulation
for i in range(n):

  # initializing empty list to store card obtained in each day
  cards_own =[]

  # days taken in each simulation
  days =1

  # looping until i get 5 cards and count the days taken for finding average
  while True:
    value = np.random.randint(0,5)
    cards_own.append(value)

    if len(cards_own)>=5:
      if all_cards(arr2=np.array(list(set(cards_own)))):
        days_taken+=days
        break 
    days+=1
  
average = days_taken/n

# recording the result in result with given precision 
result = round(average,2)
print("Average days taken = ",result," days")

Solution

  • So there are a few things that I think you could change here. First off you can change your if statement to read if len(cards_own) >= 10 since you need at least ten coupons to have 2 of everything. Additionally, in your cards_all function, you can loop through all of the possible coupons (in this case [0, 4] and check if there are at least two occurences for each of them in the array (so you don't need arr2). Something like:

    for i in range(5):
        if np.sum(arr1 == i) < 2:
            return False
    return True
    

    the if statement is just counting how many times the element i appears in the array. Also you should not be converting arr1 to a set in the function call since sets cannot have duplicate values so checking for 2 of each coupon would never work like that. Hopefully this helps!