
Why do different methods lead to different results when calculating a probability problem?


The question is:

"The winning probability for each ticket is 0.005. You get 25 lottery tickets for free. After the free tickets, you need to pay $9.6 for each ticket.

You are guaranteed to win once you reach 250 draws. (That means if someone is very unlucky and has used 249 tickets without getting any reward, the store simply gives him a reward when he buys the 250th ticket.)

Please calculate the average cost by the time you win."

Method 1 (final result: 525.48):

def method_1():
    revised_total_cost = 0.0
    for i in range(26, 250):
        probability_this_draw = (0.995 ** (i - 1)) * 0.005  # first win at exactly this draw
        cost_this_draw = 9.6 * (i - 25)  # total amount paid if the win comes at this draw
        revised_total_cost += probability_this_draw * cost_this_draw
    return revised_total_cost

Method 2 (final result: 1014.5):

def method_2():
    max_draws = 250
    free_draws = 25
    cost_per_draw = 9.6
    win_probability = 0.005
    cumulative_probability_not_winning = 1.0
    expected_total_cost = 0.0

    for draw in range(1, max_draws + 1):
        if draw > free_draws:
            # only add cost once the free draws are used up
            expected_total_cost += cost_per_draw * (1 - cumulative_probability_not_winning)

        # update the probability of still not having won before the next draw
        cumulative_probability_not_winning *= (1 - win_probability)
    return expected_total_cost

I know the first method sums over the probability distribution of the draw on which the first win happens, treating each draw independently. But I don't understand why the second method gives a different result. Could someone help me with this question? Thanks so much.


Solution

  • Your first approach is wrong because it doesn't treat "Winning after exactly 250 draws" as a special case.

    The probability of "Winning after exactly 250 draws" is not

    0.005 * (1 - 0.005)^249
    

    Because of the guaranteed win at 250 draws, the probability is instead

    P("You didn't already win a previous draw") = (1 - 0.005)^249
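
The draw-250 case can be checked numerically. The following is a minimal Python sketch, not part of the original answer; the names `prob_first_win_at`, `P_WIN` and `DRAW_LIMIT` are chosen here for illustration. It assumes the win at draw 250 happens exactly when none of the 249 earlier draws won, and confirms that the probabilities over all 250 draws then sum to 1.

```python
P_WIN = 0.005      # per-ticket win probability (from the question)
DRAW_LIMIT = 250   # the win is guaranteed at this draw


def prob_first_win_at(draw: int) -> float:
    """Probability that the first win happens at exactly this draw."""
    no_win_before = (1 - P_WIN) ** (draw - 1)
    if draw == DRAW_LIMIT:
        # Guaranteed win: the only requirement is that no earlier draw won.
        return no_win_before
    return P_WIN * no_win_before


total = sum(prob_first_win_at(d) for d in range(1, DRAW_LIMIT + 1))
print(f"P(first win at draw 250) = {prob_first_win_at(DRAW_LIMIT):.6f}")
print(f"sum over all draws       = {total:.12f}")
```

If the plain formula 0.005 * (1 - 0.005)^249 were used for draw 250 as well, the probabilities would sum to less than 1, which is a quick sign that part of the distribution is missing.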
    

    Your second approach is also wrong. A ticket for a given draw is only bought if no earlier draw has won, so its price should be weighted by cumulative_probability_not_winning. Your loop weights it by 1 - cumulative_probability_not_winning, the probability that you have already won, which is why it gives 1014.5 instead of the correct result of 1145.496.
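
To make the issue in the second approach concrete, here is a minimal Python sketch based on the loop from the question. The function name `expected_cost` and its `weight_by_not_won` flag are made up for illustration. It assumes that the ticket for a draw is paid for only when no earlier draw has won, so the price of each paid draw is weighted by the probability of not having won yet.

```python
MAX_DRAWS = 250
FREE_DRAWS = 25
COST_PER_DRAW = 9.6
P_WIN = 0.005


def expected_cost(weight_by_not_won: bool) -> float:
    """Sum the ticket price over all paid draws, weighted per draw."""
    p_not_won_yet = 1.0  # probability of no win in any earlier draw
    total = 0.0
    for draw in range(1, MAX_DRAWS + 1):
        if draw > FREE_DRAWS:
            # True: weight by "still playing"; False: weighting used in the question.
            weight = p_not_won_yet if weight_by_not_won else 1 - p_not_won_yet
            total += COST_PER_DRAW * weight
        p_not_won_yet *= 1 - P_WIN
    return total


print(round(expected_cost(weight_by_not_won=False), 2))  # weighting from the question
print(round(expected_cost(weight_by_not_won=True), 2))   # weighting by "not won yet"
```

The first call reproduces the 1014.5 reported in the question, and the second agrees with the 1145.496 from the answer. The guaranteed win at draw 250 needs no special case in this formulation, because only the probability of having to buy each ticket matters, not how that draw turns out.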

    FWIW, here is a C implementation to calculate it:

    #include <math.h>
    #include <stdio.h>

    #define PRICE 9.6
    #define PROB_WIN 0.005
    #define FREE_TICKETS 25
    #define DRAW_LIMIT 250
    
    double prob_no_win_before_draw(int draw)
    {
      return pow(1-PROB_WIN, draw-1);
    }
    
    double prob_win_in_draw(int draw)
    {
      return PROB_WIN * prob_no_win_before_draw(draw);
    }
    
    double price(int draw)
    {
      return PRICE * (draw - FREE_TICKETS);
    }
    
    int main(void)
    {
      double average_price = 0.0;
    
      for (int x = FREE_TICKETS + 1; x < DRAW_LIMIT; ++x)
      {
        average_price += prob_win_in_draw(x) * price(x);
      }
      // Special case for DRAW_LIMIT
      average_price += prob_no_win_before_draw(DRAW_LIMIT) * price(DRAW_LIMIT);
    
      printf("average_price %.08f\n", average_price);
    
      return 0;
    }
    

    Output:

    average_price 1145.49573578
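
As a cross-check, the same value can be obtained without a loop. This is a sketch under the assumption that each paid draw d (from 26 to 250) costs 9.6 and is reached with probability 0.995^(d-1); summing that geometric series gives 9.6 * (0.995^25 - 0.995^250) / 0.005. The variable names mirror the C macros above but are otherwise arbitrary.

```python
PRICE = 9.6
P_WIN = 0.005
FREE_TICKETS = 25
DRAW_LIMIT = 250

q = 1 - P_WIN  # probability that a single ticket does not win

# Closed form of sum(q ** (d - 1) for d in FREE_TICKETS + 1 .. DRAW_LIMIT),
# i.e. the expected number of tickets that have to be paid for.
expected_paid_tickets = (q ** FREE_TICKETS - q ** DRAW_LIMIT) / P_WIN
average_price = PRICE * expected_paid_tickets

print(f"average_price {average_price:.8f}")
```

The printed value should agree with the output of the C program up to floating-point rounding.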