Tags: r, statistics, chi-squared, benfords-law

Problem with the output of chisq.test with R


I'm currently creating a package for Benford's Law (for academic purposes), and I'm trying to perform a goodness-of-fit test with chisq.test.

I have this vector of first-digit counts:

prop <- c(1377, 803, 477, 381, 325, 261, 253, 224, 184)

that I want to compare with this vector of probabilities (first-digit probabilities from Benford's Law):

th <- c(0.301, 0.176, 0.125, 0.097, 0.079, 0.067, 0.058, 0.051, 0.046)

Then I perform the test:

chisq.test(prop,p=th)

If I understood the purpose of the test correctly, it should return a large p-value (close to 1 rather than 0), because the proportions in the data (prop) are really similar to the theoretical proportions (th). But the output gives me:

Chi-squared test for given probabilities

data:  prop
X-squared = 22.044, df = 8, p-value = 0.004835

Can someone help me understand why it gives this low p-value?

Thanks a lot

PS :

I performed chisq.benftest (Pearson's Chi-squared Goodness-of-Fit Test for Benford's Law) with the same data and it gave me a more coherent p-value (0.7542), so I must have made a mistake somewhere, but I don't know where.


Solution

  • I think the low p-value is because you have a good number of measurements, and the data just doesn't fit the theoretical expectation well enough.

    If you had fewer measurements there would be more uncertainty and you would get higher p-values.

    chisq.test(prop/2, p=th)   # p-value = 0.1916
    chisq.test(prop/3, p=th)   # p-value = 0.4884
    chisq.test(prop/4, p=th)   # p-value = 0.6929
    chisq.test(prop/5, p=th)   # p-value = 0.8121
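
    To see why the statistic comes out large, it helps to recompute it by hand. The sketch below mirrors what chisq.test(prop, p = th) does for given probabilities: the expected counts are sum(prop) * th, and the statistic is the sum of squared deviations divided by the expected counts.

```r
# Sketch of what chisq.test(prop, p = th) computes for given probabilities.
prop <- c(1377, 803, 477, 381, 325, 261, 253, 224, 184)
th   <- c(0.301, 0.176, 0.125, 0.097, 0.079, 0.067, 0.058, 0.051, 0.046)

expected <- sum(prop) * th                       # expected counts under Benford
stat     <- sum((prop - expected)^2 / expected)  # Pearson's X-squared
pval     <- pchisq(stat, df = length(prop) - 1, lower.tail = FALSE)

stat  # same X-squared that chisq.test reports for these vectors
pval  # same p-value
```

    Note that the deviations (prop - expected) scale with the sample size, so the same relative misfit produces a larger statistic when there are more observations.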
    

    To see where the test is finding the most discrepancy, you can plot the Pearson residuals (a "chi-gram") like this:

    barplot((prop - sum(prop) * th) / sqrt(sum(prop) * th))

    This is a plain R example, checking goodness of fit against an equal distribution:

    a <- c(11, 9)
    t <- c(0.5, 0.5)
    chisq.test(a,p=t)
    

    That gives p-value 0.6547 because it's a rather small number of measurements, and only 1 degree of freedom.

    But if you run the same test, with the same proportions, with a larger and larger number of observations, the p-value keeps falling:

    chisq.test(a*3,p=t)      # p-value = 0.4386
    chisq.test(a*10,p=t)     # p-value = 0.1573
    chisq.test(a*20,p=t)     # p-value = 0.0455
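
    The pattern is easy to verify by hand: multiplying all counts by k multiplies both the observed and expected counts by k, so every term (obs - exp)^2 / exp, and hence the whole statistic, grows by a factor of k while the degrees of freedom stay fixed. A quick sketch:

```r
# Scaling the counts by k multiplies Pearson's X-squared by k,
# while the degrees of freedom stay the same -- so the p-value falls.
a <- c(11, 9)
p <- c(0.5, 0.5)

chisq_stat <- function(obs, p) {
  expd <- sum(obs) * p           # expected counts under the null
  sum((obs - expd)^2 / expd)     # Pearson's X-squared
}

chisq_stat(a, p)       # 0.2
chisq_stat(a * 20, p)  # 4 = 20 * 0.2
pchisq(4, df = 1, lower.tail = FALSE)  # 0.0455, the p-value above
```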
    

    Your original data really does look very close to the theory when you plot it. But there are many degrees of freedom and you have a lot of observations.

    The same principle applies to other inferential statistics: more observations means the test becomes more certain about how well the sample represents the population.
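
    One way to convince yourself that it is the misfit, and not the sample size alone, that drives the p-value down: counts built to match the Benford proportions almost exactly keep a high p-value no matter how many observations there are. A sketch, using hypothetical counts constructed for illustration:

```r
th <- c(0.301, 0.176, 0.125, 0.097, 0.079, 0.067, 0.058, 0.051, 0.046)

# Hypothetical counts of the same magnitude as the original data, built to
# match the Benford proportions as closely as integer counts allow:
perfect <- round(4285 * th)

chisq.test(perfect, p = th)$p.value       # very close to 1
chisq.test(perfect * 10, p = th)$p.value  # still very close to 1
```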