I'm trying to test the addition and multiplication rules:
P(A or B) = P(A) + P(B) - P(A and B)
P(A and B) = P(A) * P(B)
(or=union, and=intersect)
using a simple example of throwing two dice: d1 = d2 = {1,...,6}
A = rolling a 1 on d1
B = rolling an even number on d2
Using NumPy, I generated the two sets and calculated it like this:
import numpy as np
from fractions import Fraction
d1 = np.arange(1, 7)
d2 = np.arange(1, 7)
np_1 = d1[d1 == 1].size / d1.size # == 1/6
np_even = d2[d2 % 2 == 0].size / d2.size # == 1/2
np_1_or_even = (
    np_1
    + np_even
    - (np_1 * np_even * (np.intersect1d(d1[d1 == 1], d2[d2 % 2 == 0]).size > 0))
)
Fraction(np_1_or_even).limit_denominator(100)
{1} and {2,4,6} never intersect so the previous formula is probably just P(1) + P(even) - 0
The result is 2/3.
Then, using pandas, I generated all possible outcomes and calculated it like this:
import pandas as pd
dice_outcomes = np.arange(1, 7)
two_dices_outcomes = np.array([(i, j) for i in dice_outcomes for j in dice_outcomes])
df = pd.DataFrame(columns=["d1", "d2"], data=two_dices_outcomes)
r = df[(df["d1"] == 1) | (df["d2"] % 2 == 0)].size / df.size
Fraction(r).limit_denominator(100)
The result is 7/12.
Is either of these methods correct, and can someone help me understand what I missed?
Thanks in advance.
{1} and {2,4,6} never intersect so the previous formula is probably just P(1) + P(even) - 0
That isn't what P(A and B) means. It is the probability that both events A and B happen, which is not zero here because the events are not mutually exclusive. In your example, it means that after rolling the two dice, die 1 shows a value of 1 and die 2 shows an even value (e.g. d_1 = 1 and d_2 = 4). Since the two dice are independent, P(A and B) = P(A) * P(B) = 1/6 * 1/2 = 1/12.
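As a quick check of that definition, here is a minimal sketch that enumerates all 36 outcomes with the standard library only and counts the ones where both events happen. The variable names (`outcomes`, `a_and_b`, `a_or_b`) are just illustrative and are not taken from your code.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (d1, d2) outcomes of rolling two dice.
outcomes = list(product(range(1, 7), repeat=2))

# A: die 1 shows 1; B: die 2 shows an even number.
a_and_b = [(x, y) for x, y in outcomes if x == 1 and y % 2 == 0]
a_or_b = [(x, y) for x, y in outcomes if x == 1 or y % 2 == 0]

p_a_and_b = Fraction(len(a_and_b), len(outcomes))
p_a_or_b = Fraction(len(a_or_b), len(outcomes))

print(a_and_b)    # [(1, 2), (1, 4), (1, 6)]
print(p_a_and_b)  # 1/12
print(p_a_or_b)   # 7/12
```

The intersection is taken over pairs of outcomes, not over the face values {1} and {2, 4, 6}, which is why it is not empty.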
So this part in the NumPy code is incorrect and should be omitted:
* (np.intersect1d(d1[d1 == 1], d2[d2 % 2 == 0]).size > 0)
After omitting that part, the NumPy result is also 7 / 12, which matches the pandas result and is the correct answer.
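For reference, a sketch of what the NumPy calculation could look like with that factor removed. It assumes the two dice are independent (which is what justifies multiplying the probabilities), and it uses exact `Fraction` values instead of floats as a convenience, which is my choice rather than something your code requires. The names `p_1`, `p_even` and `p_1_and_even` are illustrative.

```python
from fractions import Fraction

import numpy as np

d1 = np.arange(1, 7)
d2 = np.arange(1, 7)

# Count favourable faces on each die and keep the ratios exact.
p_1 = Fraction(int((d1 == 1).sum()), d1.size)         # P(A) = 1/6
p_even = Fraction(int((d2 % 2 == 0).sum()), d2.size)  # P(B) = 1/2

# Independent dice, so P(A and B) = P(A) * P(B).
p_1_and_even = p_1 * p_even

# Addition rule: P(A or B) = P(A) + P(B) - P(A and B).
p_1_or_even = p_1 + p_even - p_1_and_even
print(p_1_or_even)  # 7/12
```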