I have a data frame x with 2 columns, a and b; a is of class character and b is of class numeric. I fitted a Gaussian distribution to b using the fitdist function (fitdistrplus package).
data.fit <- fitdist(x$b,"norm", "mle")
I want to extract the elements in column a that fall in the 5% right tail of the fitted Gaussian distribution.
I am not sure how to proceed because my knowledge of distribution fitting is limited.
Do I need to retain the elements in column a for which b is greater than the value obtained for the 95th percentile?
Or does the fitting imply that new values have been created for each value in b and I should use those values?
Thanks
By calling unclass(data.fit) you can see all the parts that make up the data.fit object, which include:
$estimate
     mean        sd
0.1125554 1.2724377
which means you can access the estimated mean and standard deviation via:
data.fit$estimate['sd']
data.fit$estimate['mean']
To calculate the 95th percentile of the fitted distribution (the cutoff above which the upper 5% tail lies), you can use the qnorm() function (q is for quantile, BTW) like so:
threshold <- qnorm(p = 0.95,
                   mean = data.fit$estimate['mean'],
                   sd = data.fit$estimate['sd'])
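As a quick sanity check, the same cutoff can be reproduced without the fitted object by plugging in the estimates printed above. For a normal distribution, the 95th percentile is the mean plus the sd times the standard normal 95th percentile. The names est, threshold_check and manual are only illustrative and are not part of fitdistrplus.

```r
# Estimates copied from the output above
# (an illustrative stand-in for data.fit$estimate)
est <- c(mean = 0.1125554, sd = 1.2724377)

# [[ ]] extracts the bare number and drops the element name
threshold_check <- qnorm(p = 0.95, mean = est[["mean"]], sd = est[["sd"]])

# Equivalent closed form: mean + sd * (standard normal 95th percentile)
manual <- est[["mean"]] + est[["sd"]] * qnorm(0.95)

all.equal(threshold_check, manual)  # the two should agree
threshold_check                     # roughly 2.21 for these estimates
```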
and you can subset your data.frame x like so:
x[x$b > threshold,  # an indicator of the rows to return
  'a']              # the column to return
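Note that fitting does not create new values for b: it only estimates the parameters of the distribution, so the original b values are the ones compared against the threshold. Here is a minimal end-to-end sketch on made-up data. To keep it self-contained, it assumes the closed-form normal maximum-likelihood estimates (sample mean, and sd with an n denominator) as a stand-in for fitdist(x$b, "norm")$estimate, which should be very close; the simulated data and the names est, tail_a and tail_rows are hypothetical.

```r
set.seed(1)  # made-up data, for illustration only
x <- data.frame(a = paste0("id", 1:200),
                b = rnorm(200, mean = 0, sd = 1),
                stringsAsFactors = FALSE)

# Closed-form normal MLE, used here as a stand-in for
# fitdist(x$b, "norm")$estimate
est <- c(mean = mean(x$b),
         sd   = sqrt(mean((x$b - mean(x$b))^2)))

# 95th percentile of the fitted normal
threshold <- qnorm(0.95, mean = est[["mean"]], sd = est[["sd"]])

# Labels whose observed b lies in the upper 5% tail of the fitted normal
tail_a <- x[x$b > threshold, "a"]

# Same rows, keeping both columns
tail_rows <- x[x$b > threshold, , drop = FALSE]
```

The share of rows selected will usually be near 5% but not exactly 5%, because the cutoff comes from the fitted distribution rather than from the empirical quantile of b.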