I have the following data
dat<-c(16.254884, 14.077510, 12.851675, 19.152597, 11.511230,
16.122911, 16.099962, 9.670949, 12.523661, 15.257432, 13.603848,
14.118873, 12.632340, 15.413753, 5.426383, 11.369880, 12.895920,
13.635134, 15.118388,13.154107, 8.913164, 17.302810, 14.968054,
16.200151, 16.068944, 18.571952, 15.247535, 15.018281)
I am using this code to find the mode:
Mode_fc <- function(x) {
ux <- unique(x)
ux[which.max(tabulate(match(x, ux)))]
}
Using MyParam
, I am able to get the min, max and mode
MyParam <- c(min= min(dat), max= max(dat), mode= Mode_fc(dat))
When I enter these values into the code below fitdist
works as expected
fitdist(dat, "triang", start = list(min=5.4, max=19.2, mode=16.3))
But, when I try to read in MyParam I get all sorts of errors
fitdist(dat, "triang",
start = list(min=MyParam[[1]], max=MyParam[[2]], mode=MyParam[[3]]))
I know the issue is with optim()
, but I have not been able to figure out how to fix this problem. Any suggestions are appreciated!.
Your problem (admittedly fairly subtle) is that the likelihood of data under the triangular distribution is zero (and hence the log-likelihood is negative-infinite) if any of the data are outside, or on the boundaries of, the distribution. Illustration:
library(fitdistrplus)
library(mc2d) ## needed for dtriang
Trying the fits (as in your example above):
L1 <- list(min=5.4, max=19.2, mode=16.3)
fitdist(dat, "triang", start = L1) ## works
L2 <- list(min=MyParam[[1]], max=MyParam[[2]], mode=MyParam[[3]])
fitdist(dat, "triang", start = L2) ## fails
Let's break this down a bit and see what the actual log-likelihoods are for each set of parameters:
do.call(dtriang,c(list(x=dat,log=TRUE),L1))
## [1] -1.935669 -2.159550 -2.311845 -6.045302 -2.510156 -1.947902 -1.950044
## [8] -2.868448 -2.356862 -2.032059 -2.215681 -2.154794 -2.341722 -2.016325
## [15] -7.955320 -2.533557 -2.305925 -2.211875 -2.046264 -2.272062 -3.063767
## [22] -2.355858 -2.061854 -1.940724 -1.952947 -3.461371 -2.033063 -2.056619
All finite values.
(test2 <- do.call(dtriang,c(list(x=dat,log=TRUE),L2)))
## [1] -1.926160 -2.150652 -2.303450 -Inf -2.502540 -1.938423 -1.940570
## [8] -2.862702 -2.348631 -2.022796 -2.206960 -2.145882 -2.333434 -2.007021
## [15] -Inf -2.526044 -2.297509 -2.203141 -2.037041 -2.263528 -3.059363
## [22] -2.375012 -2.052673 -1.931228 -1.943481 -3.533698 -2.023803 -2.047423
Two infinite values, which correspond to the min and max values.
which(!is.finite(test2)) ## 4 15
which.min(test2) ## 4
which.max(test2) ## 5
We can easily get around this by tweaking the minimum down a bit and the maximum up a bit from the observed values:
eps <- 0.100
L3 <- list(min=MyParam[[1]]-eps, max=MyParam[[2]]+eps, mode=MyParam[[3]])
fitdist(dat, "triang", start = L3)
This works fine.