python pandas numpy math probability

Probability addition/multiplication rule & Python


I'm trying to test the addition and multiplication rules:

P(A or B) = P(A) + P(B) - P(A and B)

P(A and B) = P(A) * P(B)   (for independent events)

(or = union, and = intersection)

with a simple example of rolling two dice: d1 = d2 = {1,...,6}

A = rolling a 1 on d1

B = rolling an even number on d2

Using NumPy, I generated the two sets and calculated it like this:

from fractions import Fraction

import numpy as np

d1 = np.arange(1, 7)
d2 = np.arange(1, 7)
np_1 = d1[d1 == 1].size / d1.size # == 1/6
np_even = d2[d2 % 2 == 0].size / d2.size # == 1/2
np_1_or_even = (
    np_1
    + np_even
    - (np_1 * np_even * (np.intersect1d(d1[d1 == 1], d2[d2 % 2 == 0]).size > 0))
)
Fraction(np_1_or_even).limit_denominator(100)

{1} and {2,4,6} never intersect so the previous formula is probably just P(1) + P(even) - 0

The result is 2/3.

Then, trying to leverage pandas, I generated all possible outcomes and calculated it like this:

import pandas as pd

dice_outcomes = np.arange(1, 7)
two_dices_outcomes = np.array([(i, j) for i in dice_outcomes for j in dice_outcomes])
df = pd.DataFrame(columns=["d1", "d2"], data=two_dices_outcomes)
r = df[(df["d1"] == 1) | (df["d2"] % 2 == 0)].size / df.size
Fraction(r).limit_denominator(100)

The result is 7/12.

Is either of these methods correct, and can someone help me understand what I missed?

Thanks in advance.


Solution

  • {1} and {2,4,6} never intersect so the previous formula is probably just P(1) + P(even) - 0

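    That isn't what P(A and B) means. P(A and B) is the probability that both events A and B happen, and it is nonzero unless the events are mutually exclusive. In your example, it is the probability that, after rolling the two dice, die 1 shows 1 and die 2 shows an even value (e.g. d_1 = 1 and d_2 = 4). A and B are events over the pairs (d_1, d_2), so intersecting the face values {1} and {2,4,6} says nothing about whether both can happen. Because the two dice are independent, P(A and B) = 1/6 * 1/2 = 1/12.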

    So this part in the NumPy code is incorrect and should be omitted:

     * (np.intersect1d(d1[d1 == 1], d2[d2 % 2 == 0]).size > 0)
    

    After omitting that part, the NumPy result is also 7/12 (that is, 1/6 + 1/2 - 1/12), which matches the pandas result and is the correct answer.
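
As a cross-check, here is a minimal sketch (not part of the original question or answer) that enumerates all 36 outcomes with exact fractions and verifies both rules for this example. The helper `prob` and the variable names are illustrative assumptions, and the check relies on the two dice being fair and independent.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (d1, d2) outcomes of rolling two fair dice.
outcomes = list(product(range(1, 7), repeat=2))


def prob(event):
    """Exact probability of an event given as a predicate on (d1, d2)."""
    favourable = sum(1 for d1, d2 in outcomes if event(d1, d2))
    return Fraction(favourable, len(outcomes))


p_a = prob(lambda d1, d2: d1 == 1)        # A: die 1 shows 1
p_b = prob(lambda d1, d2: d2 % 2 == 0)    # B: die 2 shows an even value
p_a_and_b = prob(lambda d1, d2: d1 == 1 and d2 % 2 == 0)
p_a_or_b = prob(lambda d1, d2: d1 == 1 or d2 % 2 == 0)

# Multiplication rule (holds here because the dice are independent).
assert p_a_and_b == p_a * p_b
# Addition rule.
assert p_a_or_b == p_a + p_b - p_a_and_b

print(p_a, p_b, p_a_and_b, p_a_or_b)  # 1/6 1/2 1/12 7/12
```

Counting outcomes directly avoids floating-point rounding, so `Fraction.limit_denominator` is not needed in this version.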