Tags: r, rounding-error, log-likelihood, mpfr

Rounding error in density functions in R


Using R, I am trying to compute the likelihoods of a vector of values. Some of these values are far out in the tails of the distributions, and the computed densities underflow to exactly zero, making my code throw an error when drawing samples.

I looked into using the Rmpfr package to get higher precision, but this changes my variable type. I also considered rewriting the expression into a log-form to avoid multiplication. Then the expression would become: exp(log(w[X]) + dnorm(y, x[X], sigma[X], log = TRUE))

This still causes the function to return a zero, due to the exponentiation.
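
To make the underflow concrete: exp() of anything much below -745 is exactly zero in double precision, and while Rmpfr can represent such small values, it returns "mpfr" objects instead of plain numeric vectors. A minimal demonstration (assuming Rmpfr is installed):

  exp(-745)  # about 5e-324, near the smallest subnormal double
  exp(-746)  # 0 -- underflow, which is what happens in my expression

  library(Rmpfr)
  z <- exp(mpfr(-746, precBits = 128))
  z          # representable at 128-bit precision
  class(z)   # "mpfr", not "numeric" -- the type change mentioned above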

Is there a way to formulate this problem with logs that preserves numerical precision? What I want in the end is for the following probabilities to be computed correctly, summing to one:

  liks <- lapply(1:3, function(X) {
    w[X] * dnorm(y, x[X], sigma[X])
  })

  probs1 <- liks[[1]]/(liks[[1]] + liks[[2]] + liks[[3]])
  
  probs2 <- liks[[2]]/(liks[[1]] + liks[[2]] + liks[[3]])
  
  probs3 <- liks[[3]]/(liks[[1]] + liks[[2]] + liks[[3]])

Edit: adding a numerical example for the log code

Here are some parameter and sample values that start causing errors with the log code. They do not produce NaNs; instead, each of the three entries is zero:

$mu
[1] 0.7323412910 0.7742235621 0.4863889347

$w
[1] 0.008464 0.083536 0.908000

$sigma
[1] 0.08209500030 0.08166088502 0.09168991045

Observation values:
c(4.667935371,  5.654500961,  4.383309364,  4.396201611,  4.452524185,  4.441100597,  4.890487194,  4.416962624,  5.241273880,  4.347382069,  4.867616177,  4.895996094,  4.592288494, -3.612523079,  4.817468166,  4.783963203,  4.541391850,  4.709537983,  5.227987289,  5.585811138,  4.497674942,  4.989979267,  4.489729881)

All of these observation values are rather extreme outliers in my dataset, which would explain why they are assigned probabilities so small that they are rounded to zero.
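
For instance, the first observation lies more than 45 standard deviations above the third component's mean, so its density underflows to zero even though its log-density is perfectly finite:

  dnorm(4.667935371, 0.4863889347, 0.09168991045)              # 0 -- underflow
  dnorm(4.667935371, 0.4863889347, 0.09168991045, log = TRUE)  # about -1038
  log(.Machine$double.xmin)                                    # about -708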


Solution

  • You should work with logs throughout; don't take the exponential so soon. For example, here's a change:

    logliks <- lapply(1:3, function(X) {
      log(w[X]) + dnorm(y, x[X], sigma[X], log = TRUE)
    })
    

    Now, to evaluate expressions like probs1, you want to divide numerator and denominator by the biggest of the liks values, i.e. compute

    (liks[[1]]/biggest)/(liks[[1]]/biggest + liks[[2]]/biggest + liks[[3]]/biggest)
    

    but do it all on the log scale:

    logbiggest <- max(as.numeric(logliks))
    logprobs1 <- (logliks[[1]] - logbiggest) -
      log(exp(logliks[[1]] - logbiggest)
          + exp(logliks[[2]] - logbiggest)
          + exp(logliks[[3]] - logbiggest))
    

    and similarly for logprobs2 and logprobs3. Since logbiggest is equal to one of the logliks, one of those exponentials will equal 1.0, and then it doesn't matter if the other ones underflow.
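
    This is just the standard log-sum-exp trick. If you'd rather not hand-roll it, a prepackaged stable implementation such as matrixStats::logSumExp (assuming that package is available) does the same max-shifting internally:

    ll <- unlist(logliks)            # the three weighted log-densities for one observation
    logprobs <- ll - matrixStats::logSumExp(ll)
    exp(logprobs)                    # normalized probabilities, summing to 1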

    Edited to add: numerical example

    You added data to your question. Here is the full calculation using your data. I don't get any zero probabilities, but most of them are very small:

    x <- c(0.7323412910, 0.7742235621, 0.4863889347)
    w <- c(0.008464, 0.083536, 0.908000)
    sigma <- c(0.08209500030, 0.08166088502, 0.09168991045)
    
    y <- c(4.667935371,  5.654500961,  4.383309364,  4.396201611,  4.452524185,  4.441100597,  4.890487194,  4.416962624,  5.241273880,  4.347382069,  4.867616177,  4.895996094,  4.592288494, -3.612523079,  4.817468166,  4.783963203,  4.541391850,  4.709537983,  5.227987289,  5.585811138,  4.497674942,  4.989979267,  4.489729881)
    
    probs <- matrix(NA, length(y), 3)
    
    for (i in seq_along(y)) {
      logliks <- lapply(1:3, function(X) {
        log(w[X]) + dnorm(y[i], x[X], sigma[X], log = TRUE)
      })
      
      logbiggest <- max(as.numeric(logliks))
      
      logprobs1 <- (logliks[[1]] - logbiggest) -
        log(exp(logliks[[1]] - logbiggest)
            + exp(logliks[[2]] - logbiggest)
            + exp(logliks[[3]] - logbiggest))
      
      logprobs2 <- (logliks[[2]] - logbiggest) -
        log(exp(logliks[[1]] - logbiggest)
            + exp(logliks[[2]] - logbiggest)
            + exp(logliks[[3]] - logbiggest))
      
      logprobs3 <- (logliks[[3]] - logbiggest) -
        log(exp(logliks[[1]] - logbiggest)
            + exp(logliks[[2]] - logbiggest)
            + exp(logliks[[3]] - logbiggest))
      
      probs[i, ] <- exp(c(logprobs1, logprobs2, logprobs3))
    }
    
    head(probs)
    #>              [,1]         [,2] [,3]
    #> [1,] 4.006975e-50 9.059815e-44    1
    #> [2,] 1.964023e-93 2.172199e-87    1
    #> [3,] 6.103586e-40 1.274113e-33    1
    #> [4,] 2.221979e-40 4.668064e-34    1
    #> [5,] 2.538854e-42 5.467829e-36    1
    #> [6,] 6.336084e-42 1.358276e-35    1
    

    Created on 2023-03-26 with reprex v2.0.2

    The last column is equal to 1 because of rounding; the other entries are very small numbers. This makes sense: the y values are much larger than all three means, and the 3rd component has the largest variance, so the model concludes that all those outliers are most likely drawn from that component. Multiply sigma by 10 and you'll get less extreme probabilities.
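
    As a quick sanity check, every row of probs should sum to 1:

    all(abs(rowSums(probs) - 1) < 1e-12)  # should be TRUE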