I'm trying to write a bit of code in R that takes sample data from Excel, identifies the best-fitting distribution for the data, and then estimates the parameters of that distribution. After a bit of Googling, I decided to try fitdistrplus for fitting the distributions, and saw that gofstat is a function that can be used to check the goodness-of-fit. I wanted to compare the GOF statistics in a loop to find the best-fitting distribution.
The initial part of my code just imports my sample data from Excel (I generated 1000 normally distributed values in Excel and saved them as a single column in CSV format), then tries to fit a distribution to it and plot the results.
library(fitdistrplus)
testData = read.table("C:\\Users\\Havok\\Documents\\Skripsie\\Excel\\NormalTest1.csv", header=FALSE)
(func <- apply(testData, 2, fitdist, "norm"))
gofstat(func)
for(i in 1:1000)
plot(f[[i]])
However, whenever I try to run the code, I get the error messages
gofstat(func)
Error in gofstat(func) : argument f must a 'fitdist' object or a list of 'fitdist' objects.
for(i in 1:1000)
+ plot(f[[i]])
Error in f[[i]] : subscript out of bounds
The plots still appear despite the "subscript out of bounds" error (I think it might be due to stray negative values in the imported data), but I really want to find out what is wrong with my gofstat usage. Any ideas?
P.S. My R experience is limited to a single module we had in university, and it was pretty basic. So any advanced tricks would be appreciated.
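I don't think you need apply here; with a margin of 2 it calls fitdist once per column instead of once on the whole sample. Pass the data to fitdist directly as a numeric vector: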
library(fitdistrplus)
set.seed(1234)
testData = rnorm(1000)
fit <- fitdist(testData, "norm")
plot(fit)
gofstat(fit)
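
To compare several distributions, as the question asks, one option is to fit each candidate and hand the whole list to gofstat. The following is only a sketch under some assumptions: the candidate set (norm, lnorm, gamma) is an illustrative choice, the data is simulated with a positive mean so that lnorm and gamma are valid, and "lowest AIC" is just one possible selection rule. With the real file, the sample would first be pulled out as a numeric vector, for example with read.csv(path, header = FALSE)[[1]].

```r
library(fitdistrplus)

set.seed(1234)
# Simulated stand-in for the imported column. It is assumed to be strictly
# positive so that "lnorm" and "gamma" can be fitted; drop those candidates
# if the real data contains values <= 0.
testData <- rnorm(1000, mean = 10, sd = 2)

# Illustrative candidate set; other distributions supported by fitdist
# could be added here.
candidates <- c("norm", "lnorm", "gamma")

# One fitdist object per candidate, kept in a named list.
fits <- lapply(candidates, function(d) fitdist(testData, d))
names(fits) <- candidates

# gofstat accepts a list of fitdist objects and reports the
# goodness-of-fit statistics and criteria side by side.
gof <- gofstat(fits, fitnames = candidates)
print(gof)

# Select the candidate with the lowest AIC and show its parameter estimates.
best <- candidates[which.min(gof$aic)]
print(best)
print(fits[[best]]$estimate)
```

The gofstat result also carries the Kolmogorov-Smirnov, Cramer-von Mises and Anderson-Darling statistics (gof$ks, gof$cvm, gof$ad), so the selection rule can be swapped for any of those.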