I am working on Generalized Pareto Distribution in R and I want to calculate the bias, mse and rmse of the disitribtion. However when I'm using the mle function the following error occured.
Error in dgpd(x = samples, scale = scale, shape = shape, log = TRUE) :
invalid scale
The code is as following:
library(stats4)
library(evd)
gpd_simulation<-function(N,n,scale,shape){
conc<-numeric(N)
for(i in 1:N){
samples=rgpd(n,scale=scale,shape=shape)
gpdlik<-function(scale,shape){
-sum(dgpd(x=samples,scale=scale,shape=shape,log=TRUE))
}
result<-mle(minuslogl=gpdlik,start=list(scale=11.63599901,shape=0.1),lower=c(0,0),method="L-BFGS-B")
conc[i]<-result@coef
}
conc
}
summarygpd1<-function(conc){
bias=mean(conc)-3;bias
mse=var(conc)+bias^2;mse
rmse=sqrt(var(conc)+bias^2);rmse
return(c(bias,mse,rmse))
}
simulate40<-gpd_simulation(1000,40,11.63599901,0.1);simulate40
summarygpd1(simulate40)
However, if i use my own dataset, the following error would not occur, is there something wrong with using rgpd this way?
Just to add, if I increase my sample size, the codes run fine, I increase it to 80, 100 and 120, The code run with no error, but 40 and 60 the error will occur again.
When I run your code, it fails when the mle()
function tries a scale that is exactly zero. It never chooses an MLE that's near zero, so it's probably simplest to set the lower limit on the parameter to a small positive number instead of using zero, e.g. setting the lower limit to 0.01 works for me with warnings but no error.
The warnings come from the line conc[i] <- result@coef
. At this point result@coef
is length 2, and conc[i]
can only hold a single number. If you only want to save the scale, use conc[i] <- result@coef[1]
.
This code incorporates both fixes:
gpd_simulation<-function(N,n,scale,shape){
conc<-numeric(N)
for(i in 1:N){
samples=rgpd(n,scale=scale,shape=shape)
gpdlik<-function(scale,shape){
-sum(dgpd(x=samples,scale=scale,shape=shape,log=TRUE))
}
result<-mle(minuslogl=gpdlik,start=list(scale=11.63599901,shape=0.1),lower=c(0.01,0),method="L-BFGS-B")
conc[i]<-result@coef[1]
}
conc
}
A more general solution is to figure out what should happen in the limit as scale tends to zero, and have gpdlik
return that value when scale
is exactly zero.