Search code examples
wolfram-mathematicamathematica-8

How do you work out Conditional Probabilities in Mathematica. Is it possible?


Can Mathematica do Bayes Rule conditional probability calculations, without doing the calculation manually? If so how?

I have been searching both the Mathemtaica doco and the web for a hint but cannot find anything. I am not after how to do Bayes Rule manually via Mathematica, I want to know if there is a way to define the conditional probabilities and calculate other ones automagically.

So to use the toy example assuming Bernoulli distributions

P(Cancer+) = 0.01
P(Cancer-) = 0.99

P(Test+|Cancer+) = 0.9
P(Test-|Cancer+) = 0.1
P(Test+|Cancer-) = 0.2
P(Test-|Cancer-) = 0.8

Is it possible to work out

P(Cancer+|Test+) = 0.0434

So using the below.

Print["P(C+) = ", PCancerT=BernoulliDistribution[0.01]];
Print["P(C-) = ", PCancerF=BernoulliDistribution[0.99]];
Print[]
Print["P(T+|C+) = ", PTestTGivenCancerT=BernoulliDistribution[0.9]];
Print["P(T-|C+) = ", PTestFGivenCancerT=BernoulliDistribution[0.1]];
Print["P(T+|C-) = ", PTestTGivenCancerF=BernoulliDistribution[0.2]];
Print["P(T-|C-) = ", PTestFGivenCancerF=BernoulliDistribution[0.8]];
Print[]
Print["P(T+,C+) = ", PTestTAndCancerT = Probability[vCT&&vTTCT,{vCT\[Distributed]PCancerT,vTTCT\[Distributed]PTestTGivenCancerT}]];
Print["P(T-,C+) = ", PTestFAndCancerT = Probability[vCT&&vTFCF,{vCT\[Distributed]PCancerT,vTFCF\[Distributed]PTestFGivenCancerT}]];
Print["P(T+,C-) = ", PTestTAndCancerF = Probability[vCF&&vTTCF,{vCF\[Distributed]PCancerF,vTTCF\[Distributed]PTestTGivenCancerF}]];
Print["P(T-,C-) = ", PTestFAndCancerF = Probability[vCF&&vTTCF,{vCF\[Distributed]PCancerF,vTTCF\[Distributed]PTestFGivenCancerF}]];
Print[]
Print["P(C+|T+) = ?"];
Print["P(C+|T-) = ?"];
Print["P(C-|T+) = ?"];
Print["P(C-|T-) = ?"];

I can work out the joint probabilities by defining all the probability tables manually, but is there a way to get Mathematica to do the heavy lifting? Is there a way to define and calculate these kind of conditional probabilities?

Many thanks for any assistance, even it its “You can’t... stop trying” :)

PS : was this an attempt at doing something along these lines? Symbolic Conditional Expectation in Mathematica


Solution

  • Actually... I worked this out symbolically in the past, and it covers a lot of simple (unchained) probabilities. I guess it wouldn't be that hard to add chaining(see below). You're welcome to reply with augmentation. The symbolic approach is far more flexible than working with Bernoulli distributions and creating a proc for Bayes theorem and thinking about the right way to apply it every time.

    NOTE: The functions are not bound, like in the post above ((0 < pC < 1) && (0 < pTC < 1) && (0 < pTNC < 1)) because sometimes you want "unweighted" results, which produce numbers outside of 0-1 range, then you can bring back into the range by dividing by some normalizing probability or product of probabilities. If you do want to add bounds for error checking, do this:
    P[A_ /;0<=A<=1] := some_function_of_A;

    use Esc+cond+Esc to enter \\[Conditioned] symbol in Mathematica.

    Remove[P];
    Unprotect@Intersection;
    Intersection[A_Symbol, B_Symbol] := {A, B}
    Intersection[A_Not, B_Symbol] := {A, B}
    Intersection[A_Symbol, B_Not] := {A, B}
    P[Int_List/; Length@Int == 2] := P[Int[[2]] \[Conditioned] Int[[1]]] P[Int[[1]]]
       (*//  P(B) given knowledge of P(A)  //*)
    P[B_, A_] := If[NumericQ@B, B, 
                    P[B \[Conditioned] A] P[A] + P[B \[Conditioned] Not@A] P[Not@A]]
    P[Not@B_, A_: 1] := If[NumericQ@A, 1 - P[B], 1 - P[B, A]]
    P[A_ \[Conditioned] B_] := P[A \[Intersection] B]/P[B, A]
    P[Not@A_ \[Conditioned] B_] := 1 - P[A \[Conditioned] B];
    

    You then use it as such:

    P[Cancer]=0.01;
    

    Don't need "not cancer" since P[!Cancer] yields 0.99 (Esc+not+Esc types a very pretty logical not symbol, but Not[A], !A or \[Not]A work just fine too)

    P[Test \[Conditioned] Cancer] = 0.9
    P[Test \[Conditioned] ! Cancer] = 0.2
    

    again: P[!Test \\[Conditioned] Cancer] will be 1-P[Test \\[Conditioned] Cancer] by definition, unless you override it.

    Now let's query this model:

    P[Test, Cancer]
    P[!Test, Cancer]
    

    returns

    0.207
    0.793
    

    and

    P[Cancer \[Conditioned] Test]
    P[!Cancer \[Conditioned] Test]
    P[Cancer \[Conditioned] !Test]
    P[!Cancer \[Conditioned] !Test]
    

    returns

    0.0434783
    0.956522
    0.00126103
    0.998739
    

    I guess it would be a nice idea to define P(B|A1,A2,A3,...,An), anyone up for coding the chain rule using NestList or something like it? I didn't need it for my project, but it wouldn't be that difficult to add, should someone need it.