Handling very small numbers in ratio and how to keep exponential value


I am currently using R version 3.4.4 (2018-03-15) with RStudio.

I need to calculate the ratio of two values, and I run into problems in some cases:

  • The numerator can be a very small value: exp(-2408.9), which R approximates as 0.
  • So can the denominator: exp(-2405) is evaluated as 0 in R.

When the ratio is computed, I get a NaN (because of 0/0).
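
The underflow described above can be reproduced in a few lines of base R. This is only a minimal sketch with illustrative variable names (`log_num`, `log_denom`), using the two exponents quoted in the list; it also shows that subtracting the exponents before calling exp avoids the 0/0 when there is a single term in the denominator.

```r
# Exponents taken from the question; the variable names are illustrative only.
log_num   <- -2408.9   # log of the numerator
log_denom <- -2405     # log of the denominator

exp(log_num)                     # underflows to 0 in double precision
exp(log_num) / exp(log_denom)    # 0/0, which R reports as NaN

# Working on the log scale and exponentiating only the difference:
ratio <- exp(log_num - log_denom)   # exp(-3.9), roughly 0.02
ratio
```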

First solution:

I use the Brobdingnag library, which keeps numbers in exponential form, and I find that the ratio is actually exp(-3.8987) = 0.02026725.

However, profiling my code with the profvis library shows that, although Brobdingnag is very useful in my case, it costs a lot in terms of performance. I cannot keep this solution, because I have to run many simulations of my algorithm.

Questions about another solution:

Do you know of another library for dealing with very small (or very large) values?

I would like to keep my numerator and denominator as exponential expressions until the division is made, but I have no idea how to do it. My numerator and denominator are vectors, which I divide once both have been calculated (I cannot obtain the denominator without the numerator vector). Is there a way to "force" R to keep a value as an exp expression instead of evaluating it to a number (and getting 0...)?

Thank you in advance for any help.

EDIT:

Here is the ratio I have to calculate:

https://ibb.co/dFHx4z

I am not sure I can use the trick exp(x)/exp(y) = exp(x-y), because I have a sum in the denominator. That is why I need to keep the exp form until I compute the ratio. The values inside the exp are large negative numbers, so their exp evaluates to 0. I also tried taking the log of the numerator, so that I get the log of the first part plus the second part (without the exp), but sometimes the first part of the numerator (1/sqrt...) is too small and its log returns -Inf.

I think there is a way, but I can't find it.

Thanks for all the answers btw!

EDIT 2:

####### Function that calculates the density (with the Brobdingnag package):

library(Brobdingnag)

density <- function(nc, yc, X, beta, sig, k){

    # nc   : integer, number of observations
    # yc   : numeric vector
    # X    : matrix
    # beta : numeric matrix, one column per component k
    # sig  : numeric vector, one value per component k

    v <- yc - X %*% beta[,k]   # residual vector for component k
    res <- as.brob(1/(2*pi*sig[k]))^(nc/2) * exp(as.brob(-(1/(2*sig[k])) * t(v) %*% v))
    return(res)
}

####### Code for calculation of the ratio :

# n_c[c] : num [1] 340
# y_c[c] : num [1:340] 1.279 0.777 1.069 0.864 1.56 ...
# X[c] : num [1:340, 1:11] 1 1 1 1 1 1 1 1 1 1 ... (matrix of 0 and 1)
# beta : num [1:11, 1:2] 1.542 -0.226 -0.145 -0.438 -0.201 ...
# sigma : num [1:2] 21.694381  4.267277
# lambda : num [1] 0.5

# Numerator :

num_tau<-sapply(1:100,function(c){
        sapply(1:4,function(k){
            lambda[k]*density(n_c[c], y_c[c],X[c],beta,sigma,k)
        })
    })

# Denominator :

denom_tau<-list()
for (c in 1:100){
    val<-0
    for (k in 1:4){
        val<-val+num_tau[k,c][[1]]
    }
    denom_tau[[c]]<-val
}

# Ratio :
for (l in 1:4){
    for (c in 1:100){
        tau[l,c]<-as.numeric(num_tau[l,c][[1]]/denom_tau[[c]])
    }
}

Solution

  • As @minem suggested, you can use the Rmpfr package. Here's one way to apply it to your case.

    First move the multipliers inside the exponential of the numerator, using the fact that a*exp(b) = exp(b + log(a)). Then re-write your density function to compute the log numerator:

    log_numerator <- function(nc, yc, X, beta, sig, k, lambda){
      v <- yc - X %*% beta[,k]
      res <- -sum(v*v)/(2*sig[k]) - (nc/2)*log(2*pi*sig[k]) + log(lambda[k])
      drop(res)
    }
    

    Note that lambda is now passed to this function. Also note that the dot product of the vector yc - X %*% beta[,k] with itself can be computed more efficiently as sum(v*v), as shown.
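
    For reference, the expression inside log_numerator can be written out as follows. This is a sketch of the algebra only: the symbols are assumed to map to the arguments as \lambda_k = lambda[k], \sigma_k = sig[k], n_c = nc, y_c = yc and \beta_k = beta[,k], and the starting point is the density function given in the question.

    ```latex
    \begin{aligned}
    \log\!\left[\lambda_k \left(\frac{1}{2\pi\sigma_k}\right)^{n_c/2}
          \exp\!\left(-\frac{1}{2\sigma_k}\,\lVert y_c - X\beta_k \rVert^2\right)\right]
    &= \log\lambda_k
       + \frac{n_c}{2}\,\log\frac{1}{2\pi\sigma_k}
       - \frac{\lVert y_c - X\beta_k \rVert^2}{2\sigma_k} \\
    &= -\frac{\lVert y_c - X\beta_k \rVert^2}{2\sigma_k}
       - \frac{n_c}{2}\,\log(2\pi\sigma_k)
       + \log\lambda_k
    \end{aligned}
    ```

    The three terms on the last line correspond, in order, to the three terms of res in the function.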

    Now we can generate some data. Here I fix c and just have k = 1:2.

    set.seed(1)
    n_c <- 340
    y_c <- rnorm(340)
    dat <- data.frame(fac = sample(letters[1:11], 340, replace = TRUE))
    X_c <- model.matrix(~ fac, data = dat)
    beta <- matrix(runif(22, -10, 10), 11, 2)
    sigma <- c(21.694381,  4.267277)
    lambda <- c(0.5, 0.5)
    

    Using your density function we have

    x1 <- lambda[1] *density(n_c, y_c,X_c,beta,sigma,1)
    y1 <- lambda[2] *density(n_c, y_c,X_c,beta,sigma,2)
    x1
    # [1] +exp(-1738.4)
    y1
    # [1] +exp(-1838.7)
    as.numeric(y1/sum(x1, y1))
    # [1] 2.780805e-44
    

    Using the log-numerator function we have

    library(Rmpfr)
    p <- 40  # precision in bits
    x <- mpfr(log_numerator(n_c, y_c,X_c,beta,sigma,1, lambda), p)
    y <- mpfr(log_numerator(n_c, y_c,X_c,beta,sigma,2, lambda), p)
    x
    # 1 'mpfr' number of precision  40   bits 
    # [1] -1738.379327798
    y
    # 1 'mpfr' number of precision  40   bits 
    # [1] -1838.67033143
    exp(y)/sum(exp(x), exp(y))
    # 1 'mpfr' number of precision  53   bits 
    # [1] 2.780805017186589e-44
    

    So certainly mpfr can be used to produce equivalent results, but without better test code it's hard to check timings.

    You could also improve efficiency by using more vectorization. E.g. we can vectorize log_numerator over k:

    log_numerator2 <- function(nc, yc, X, beta, sig, lambda){
      M <- yc - X %*% beta
      res <- -colSums(M*M)/(2*sig) - (nc/2)*log(2*pi*sig) + log(lambda)
      drop(res)
    }
    z <- log_numerator2(n_c, y_c, X_c, beta, sigma, lambda)
    z
    # [1] -1738.379 -1838.670
    

    Now suppose we have the log numerators in a c by k matrix. For illustration, suppose every row c has the same values as z:

    log_num <- mpfr(matrix(z, byrow = TRUE, 3, 2), p)
    

    You can then compute the ratios as follows:

    num <- exp(log_num)
    denom <- apply(num, 1, sum) # rowSums not implemented for mpfr
    num/denom
    # 'mpfrMatrix' of dim(.) =  (3, 2) of precision  53   bits 
    #     [,1]              [,2]                 
    # [1,] 1.000000000000000 2.780805017186589e-44
    # [2,] 1.000000000000000 2.780805017186589e-44
    # [3,] 1.000000000000000 2.780805017186589e-44
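
If the extra precision of mpfr turns out to be too slow as well, one alternative that stays in base R is the log-sum-exp trick: subtract the row maximum before exponentiating, so the sum in the denominator never underflows. This is not part of the approach above, only a minimal sketch. The helper name row_log_sum_exp is hypothetical, and the values in z are copied from the printed output above rather than recomputed.

```r
# Row-wise log(sum(exp(l))), computed stably by factoring out the row maximum.
# (Hypothetical helper, not from any package.)
row_log_sum_exp <- function(log_num) {
  m <- apply(log_num, 1, max)
  # log_num - m recycles m down the columns, so row i is shifted by m[i]
  m + log(rowSums(exp(log_num - m)))
}

# Log numerators copied from the output above: one row per c, one column per k.
z <- c(-1738.379327798, -1838.67033143)
log_num <- matrix(z, nrow = 3, ncol = 2, byrow = TRUE)

log_denom <- row_log_sum_exp(log_num)   # log of the sum over k, one value per row
tau <- exp(log_num - log_denom)         # ratios, computed in ordinary doubles
tau
# Column 1 is 1 to double precision; column 2 is about 2.78e-44,
# in line with the mpfr result above.
```

Everything here is plain double arithmetic on matrices, so it vectorizes over both c and k. It works because only differences of log values are ever exponentiated.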