Search code examples
rprobability

Weird results when using the probability function from the discreteRV package


library(discreteRV)
X <- RV(c(0, 9), c(1/2, 1/2))
Y <- RV(c(-3, 1), c(1/7, 6/7))

P1 <- P(X - Y > 0) # P1 = 0.57
P2 <- P( X - Y < 0 | X > 0) # P2 = 0
P3 <- P( X - Y < 0 | Y <= 0) # P3 = 3
P4 <- P(cos(pi * X * Y) < 1/2) # doesn't compile 
P5 <- P(X ** 2 + 3 * Y >= 3) # P5 = 0.9285
P6 <- P(X - Y < X ** 2 + 3 * Y) # P6 = 0 

When I am using the built-in function P from the package discreteRV, I get some really weird results. I've also tried a different approach and used the function sample to create a discrete random variable and the results seem ok

Xpmf <- c(1/2, 1/2)
X <- sample(c(0, 9), size = 10000, replace = TRUE, prob = Xpmf)
Ypmf <- c(1/7, 6/7)
Y <- sample(c(-3, 1), size = 10000, replace = TRUE, prob = Ypmf)
P1 <- mean(X - Y > 0) # P1 = 0.57
P2 <- mean( X - Y < 0 | X > 0) # P2 = 0.92
P3 <- mean( X - Y < 0 | Y <= 0) # P3 = 0.56 
P4 <- mean(cos(pi * X * Y) < 1/2) # P4 = 0.50
P5 <- mean(X ** 2 + 3 * Y >= 3) # P5 = 0.92
P6 <- mean(X - Y < X ** 2 + 3 * Y) # P6 = 0.92 

Solution

  • When you do:

    Xpmf <- c(1/2, 1/2)
    X <- sample(c(0, 9), size = 10000, replace = TRUE, prob = Xpmf)
    Ypmf <- c(1/7, 6/7)
    Y <- sample(c(-3, 1), size = 10000, replace = TRUE, prob = Ypmf)
    P2 <- mean( X - Y < 0 | X > 0) # P2 = 0.92
    

    you do not calculate an approximation of the conditional probability P( X - Y < 0 | X > 0). The | in mean( X - Y < 0 | X > 0) is a logical OR, this is not a conditioning. It is easy to check that X-Y is never <0 when X>0, so the correct value of P2 is 0.

    The probability higher than 3 sounds like a bug. Or maybe you need to set the joint distribution with jointRV, I don't know whether discreteRV assumes independence by default.


    EDIT

    The package does not assume independence by default:

    > P((X == 0) %AND% (Y == 1)) # should be 1/2*6/7 if independence
    [1] 0
    

    So you have to use jointRV.

    __

    EDIT

    You can specify independence as follows:

    XandY <- jointRV(
      outcomes = list(c(0,9), c(-3,1)), 
      probs = c(t(outer(c(1/2,1/2), c(1/7,6/7))))
    )
    X <- marginal(XandY, 1)
    Y <- marginal(XandY, 2)
    

    However that does not solve the issue:

    P( X - Y < 0 | Y <= 0) # still 3
    

    That's because X-Y and Y are not defined on the same sample space.

    You can get this conditional probability as follows:

    XminusY_and_Y <- joint(X-Y, Y)
    XminusY <- marginal(XminusY_and_Y, 1)
    Y <- marginal(XminusY_and_Y, 2)
    P(XminusY < 0 | Y <= 0) # 0.3673469
    

    Not highly convenient...