
How to properly use gofstat in R?


I'm trying to write a bit of code in R that takes sample data from Excel, identifies the best-fitting distribution for the data, and then estimates the parameters of that distribution. After a bit of Googling, I decided to try fitdistrplus for fitting the distributions, and saw that gofstat is a function that can be used to check the goodness-of-fit. I wanted to compare the GOF statistics in a loop to find the best-fitting distribution.

The initial part of my code just imports my sample data from Excel (I created 1000 normally distributed values in Excel and saved them as a single column in CSV format), then tries to fit a distribution to it and plot the results.

library(fitdistrplus)
testData = read.table("C:\\Users\\Havok\\Documents\\Skripsie\\Excel\\NormalTest1.csv", header=FALSE)
(func <- apply(testData, 2,  fitdist, "norm"))
gofstat(func)
for(i in 1:1000)
  plot(f[[i]])

However, whenever I try to run the code, I get the error messages

gofstat(func)
Error in gofstat(func) : argument f must a 'fitdist' object or a list of 'fitdist' objects.
for(i in 1:1000)
+ plot(f[[i]])
Error in f[[i]] : subscript out of bounds

The plots still appear despite the "subscript out of bounds" error (I think it might be due to stray negative values in the imported data), but I really want to find out what is wrong with my gofstat usage. Any ideas?

P.S. My R experience is limited to a single module we had in university, and it was pretty basic. So any advanced tricks would be appreciated.


Solution

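  • I don't think you need apply here: it calls fitdist once per column, which is unnecessary for a single sample. Pass the data to fitdist directly as a numeric vector and give the resulting fit to plot and gofstat: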

    library(fitdistrplus)
    set.seed(1234)
    testData = rnorm(1000)
    fit <- fitdist(testData, "norm")
    
    plot(fit)
    gofstat(fit)
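
    To compare several candidate distributions, as the question asks, gofstat also accepts a list of 'fitdist' objects (the error message in the question says as much). The following is only a rough sketch under a few assumptions: the data is simulated rather than read from the CSV, the candidate set (norm, logis, cauchy) is my own pick of distributions that allow negative values, and lowest AIC is just one possible selection criterion. The names x, candidates, fits, gof and best are illustrative.

```r
library(fitdistrplus)

set.seed(1234)
# Stand-in for the imported column. With a data frame read from CSV,
# something like testData[[1]] would give the numeric vector instead.
x <- rnorm(1000)

# Assumed candidate set: all are defined on the whole real line,
# so negative values in the sample are not a problem.
candidates <- c("norm", "logis", "cauchy")

# lapply returns a plain list of 'fitdist' objects, one per candidate.
fits <- lapply(candidates, function(d) fitdist(x, d))

# Goodness-of-fit statistics and criteria for all fits side by side.
gof <- gofstat(fits, fitnames = candidates)
print(gof)

# Pick the candidate with the lowest AIC (one possible criterion).
best <- candidates[which.min(gof$aic)]
best_fit <- fits[[which(candidates == best)]]
print(best)
print(best_fit$estimate)
```

    Distributions that only support positive values (lnorm, gamma, weibull) would fail on data with negative values, so they are left out of this sketch.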