I am currently using R version 3.4.4 (2018-03-15) with RStudio.
I need to calculate the ratio of two values, and I run into problems in some cases:
when the ratio is computed, I get NaN (because of 0/0).
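A minimal illustration of the problem, using made-up exponents rather than my real data (the values -800 and -900 are assumptions chosen only to trigger underflow):

```r
# Hypothetical log-scale values, chosen only so that exp() underflows
log_x <- -800
log_y <- -900

x <- exp(log_x)   # too small for double precision, becomes 0
y <- exp(log_y)   # same here

ratio <- x / (x + y)   # evaluates 0 / 0
is.nan(ratio)          # the ratio is NaN even though the true value is close to 1
```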
First solution:
I use the Brobdingnag library, which keeps numbers in exponential form, and I finally obtain that the ratio actually is exp(-3.8987) = 0.02026725.
However, profiling my code with the profvis library shows that, although Brobdingnag is very useful in my case, it costs me a lot in terms of performance. I cannot keep this solution, because I have to run many simulations of my algorithm.
Questions about another solution:
Have you heard of another library for dealing with very small (or large) values?
I would like to keep my numerator and denominator in exponential form until the division is made, but I have no idea how to do it. My numerator and denominator are vectors that I divide once both are calculated (I cannot obtain the denominator without the numerator vector). Is there a way to "force" R to keep a value as an exponential instead of a plain number (which becomes 0)?
Thank you in advance for any help.
EDIT :
Here is the ratio I have to calculate :
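Written out from the density function and the ratio code under EDIT 2 (the notation, including K for the number of components, is a reading of that code and therefore an assumption), the ratio has the form:

```latex
\tau_{l,c} = \frac{\lambda_l \, f_l(y_c)}{\sum_{k=1}^{K} \lambda_k \, f_k(y_c)},
\qquad
f_k(y_c) = \left(\frac{1}{2\pi\sigma_k}\right)^{n_c/2}
\exp\!\left(-\frac{1}{2\sigma_k}\,(y_c - X_c\beta_k)^{\top}(y_c - X_c\beta_k)\right)
```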
I am not sure I can use the trick exp(x)/exp(y) = exp(x-y), because I have a sum in the denominator. That's why I need to keep the exponential form until I compute the ratio. The values inside the exp are very large negative numbers, and exp of these numbers gives 0. Also, I tried taking the log of the numerator, so that I have the log of the first part + the second part (without exp), but sometimes the first part of the numerator (1/sqrt...) is too small and its log returns -Inf.
I think there is a way, but I can't find it.
Thanks for all the answers btw!
EDIT 2 :
####### Function that calculates the density (with the Brobdingnag package):
library(Brobdingnag)
density <- function(nc, yc, X, beta, sig, k) {
  # nc is a vector of integers
  # yc is a vector of numerics
  # X is a matrix
  # beta is a matrix of numerics (one column per k)
  # sig is a vector of numerics (one value per k)
  res <- as.brob(1 / (2 * pi * sig[k]))^(nc / 2) *
    exp(as.brob(-(1 / (2 * sig[k])) * t(yc - (X %*% beta[, k])) %*% (yc - (X %*% beta[, k]))))
  return(res)
}
####### Code for calculation of the ratio :
# n_c[c] : num [1] 340
# y_c[c] : num [1:340] 1.279 0.777 1.069 0.864 1.56 ...
# X[c] : num [1:340, 1:11] 1 1 1 1 1 1 1 1 1 1 ... (matrix of 0 and 1)
# beta : num [1:11, 1:2] 1.542 -0.226 -0.145 -0.438 -0.201 ...
# sigma : num [1:2] 21.694381 4.267277
# lambda : num [1] 0.5
# Numerator:
num_tau <- sapply(1:100, function(c) {
  sapply(1:4, function(k) {
    lambda[k] * density(n_c[c], y_c[c], X[c], beta, sigma, k)
  })
})
# Denominator: for each c, sum the numerators over k
denom_tau <- list()
for (c in 1:100) {
  val <- 0
  for (k in 1:4) {
    val <- val + num_tau[k, c][[1]]
  }
  denom_tau[[c]] <- val
}
# Ratio:
tau <- matrix(NA_real_, nrow = 4, ncol = 100)  # preallocate the result
for (l in 1:4) {
  for (c in 1:100) {
    tau[l, c] <- as.numeric(num_tau[l, c][[1]] / denom_tau[[c]])
  }
}
As @minem suggested, you can use the Rmpfr package. Here's one way to apply it to your case.
First move the multipliers inside the exponential of the numerator, using the fact that a*exp(b) = exp(b + log(a)). Then rewrite your density function to compute the log numerator:
log_numerator <- function(nc, yc, X, beta, sig, k, lambda) {
  v <- yc - X %*% beta[, k]
  res <- -sum(v * v) / (2 * sig[k]) - (nc / 2) * log(2 * pi * sig[k]) + log(lambda[k])
  drop(res)
}
Note that lambda is now passed to this function. Also note that we can compute the dot product of the vector Y - X*beta with itself more efficiently, as shown.
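The two facts used above can be checked numerically. This is only a small sketch; the values of `a`, `b` and `v` are arbitrary and not taken from your data:

```r
set.seed(42)

# Identity used to move the multiplier inside the exponential
a <- 0.5
b <- -3.2
lhs <- a * exp(b)
rhs <- exp(b + log(a))
all.equal(lhs, rhs)

# The quadratic form t(v) %*% v is just the sum of squared entries of v
v <- matrix(rnorm(5), ncol = 1)
quad_matrix <- drop(t(v) %*% v)   # matrix product, returns a 1 x 1 matrix
quad_sum <- sum(v * v)            # elementwise product, then sum
all.equal(quad_matrix, quad_sum)
```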
Now we can generate some data. Here I fix c and just have k = 1:2.
set.seed(1)
n_c <- 340
y_c <- rnorm(340)
dat <- data.frame(fac = sample(letters[1:11], 340, replace = TRUE))
X_c <- model.matrix(~ fac, data = dat)
beta <- matrix(runif(22, -10, 10), 11, 2)
sigma <- c(21.694381, 4.267277)
lambda <- c(0.5, 0.5)
Using your density function we have
library(Brobdingnag)  # needed by your density() function
x1 <- lambda[1] * density(n_c, y_c, X_c, beta, sigma, 1)
y1 <- lambda[2] * density(n_c, y_c, X_c, beta, sigma, 2)
x1
# [1] +exp(-1738.4)
y1
# [1] +exp(-1838.7)
as.numeric(y1/sum(x1, y1))
# [1] 2.780805e-44
Using the log-numerator function we have
library(Rmpfr)
p <- 40  # precision in bits
x <- mpfr(log_numerator(n_c, y_c, X_c, beta, sigma, 1, lambda), p)
y <- mpfr(log_numerator(n_c, y_c, X_c, beta, sigma, 2, lambda), p)
x
# 1 'mpfr' number of precision 40 bits
# [1] -1738.379327798
y
# 1 'mpfr' number of precision 40 bits
# [1] -1838.67033143
exp(y)/sum(exp(x), exp(y))
# 1 'mpfr' number of precision 53 bits
# [1] 2.780805017186589e-44
So mpfr can certainly be used to produce equivalent results, but without better test code it's hard to check timings.
You could also improve efficiency by using more vectorization. E.g. we can vectorize log_numerator over k:
log_numerator2 <- function(nc, yc, X, beta, sig, lambda) {
  M <- yc - X %*% beta
  res <- -colSums(M * M) / (2 * sig) - (nc / 2) * log(2 * pi * sig) + log(lambda)
  drop(res)
}
z <- log_numerator2(n_c, y_c, X_c, beta, sigma, lambda)
z
# [1] -1738.379 -1838.670
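As a sanity check, the vectorized version should agree with calling the per-k version once for each k. The sketch below repeats both functions so it runs on its own, and uses small made-up inputs (the sizes and values are assumptions, not your data):

```r
# Per-k version, as defined earlier in this answer
log_numerator <- function(nc, yc, X, beta, sig, k, lambda) {
  v <- yc - X %*% beta[, k]
  drop(-sum(v * v) / (2 * sig[k]) - (nc / 2) * log(2 * pi * sig[k]) + log(lambda[k]))
}

# Version vectorized over k
log_numerator2 <- function(nc, yc, X, beta, sig, lambda) {
  M <- yc - X %*% beta   # yc is recycled down each column of X %*% beta
  drop(-colSums(M * M) / (2 * sig) - (nc / 2) * log(2 * pi * sig) + log(lambda))
}

# Small hypothetical inputs: 20 observations, 3 covariates, 2 components
set.seed(2)
n <- 20
yc <- rnorm(n)
X <- cbind(1, matrix(rbinom(n * 2, 1, 0.5), n, 2))
beta <- matrix(runif(6, -1, 1), 3, 2)
sig <- c(2, 0.5)
lambda <- c(0.3, 0.7)

per_k <- sapply(1:2, function(k) log_numerator(n, yc, X, beta, sig, k, lambda))
vec <- log_numerator2(n, yc, X, beta, sig, lambda)
all.equal(per_k, vec)
```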
Now suppose we have the log numerators in a c by k matrix. For illustration, suppose all c have the same values as z:
log_num <- mpfr(matrix(z, byrow = TRUE, 3, 2), p)
You can then compute the ratios as follows:
num <- exp(log_num)
denom <- apply(num, 1, sum) # rowSums not implemented for mpfr
num/denom
# 'mpfrMatrix' of dim(.) = (3, 2) of precision 53 bits
# [,1] [,2]
# [1,] 1.000000000000000 2.780805017186589e-44
# [2,] 1.000000000000000 2.780805017186589e-44
# [3,] 1.000000000000000 2.780805017186589e-44
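As a cross-check that does not need Rmpfr, the same ratios can be approximated in ordinary double precision by subtracting each row's maximum log numerator before exponentiating. This is a different technique from the one above (often called the log-sum-exp trick), offered only as a sketch; it relies on the common factor exp(-max) cancelling between numerator and denominator. The inputs are the rounded values of z printed earlier.

```r
# Rounded log numerators printed above, repeated for 3 illustrative rows
z <- c(-1738.379, -1838.670)
log_num <- matrix(z, nrow = 3, ncol = 2, byrow = TRUE)

# Shift each row so its largest entry is 0; exp() then cannot underflow to 0/0
row_max <- apply(log_num, 1, max)
shifted <- exp(log_num - row_max)   # row_max is recycled down each column

# The shift cancels in the ratio, so these are the same ratios as before
ratios <- shifted / rowSums(shifted)
ratios
```

The second column should agree with the mpfr result to within the rounding of the inputs, and each row sums to 1.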