Comparing Power Law with other Distributions

I'm using Jeff Alstott's Python powerlaw package to try fitting my data to a power law. Jeff's package is based on the paper by Clauset et al., which discusses power-law distributions.

First, some details on my data:

It is discrete (word count data);
It is heavily skewed to the right (high skewness);
It is leptokurtic (excess kurtosis is greater than 10).

What I have done so far

df_data is my DataFrame, where word_count is a Series containing word count data for around 1000 word tokens.

First I've generated a fit object:

import powerlaw

fit = powerlaw.Fit(data=df_data.word_count, discrete=True)

Next, I compare the power-law fit to my data against other distributions (namely lognormal, exponential, lognormal_positive, stretched_exponential and truncated_power_law) with the fit.distribution_compare(distribution_one, distribution_two) method.

As a result of the distribution_compare method, I've obtained the following (r,p) tuples for each of the comparisons:

fit.distribution_compare('power_law', 'lognormal') = (0.35617607052907196, 0.5346696007)
fit.distribution_compare('power_law', 'exponential') = (397.3832646921206, 5.3999952097178692e-06)
fit.distribution_compare('power_law', 'lognormal_positive') = (27.82736434863289, 4.2257378698322223e-07)
fit.distribution_compare('power_law', 'stretched_exponential') = (1.37624682020371, 0.2974292837452046)
fit.distribution_compare('power_law', 'truncated_power_law') = (-0.0038373682383605, 0.83159372694621)

From the powerlaw documentation:

R : float

The loglikelihood ratio of the two sets of likelihoods. If positive, the first set of likelihoods is more likely (and so the probability distribution that produced them is a better fit to the data). If negative, the reverse is true.

p : float

The significance of the sign of R. If below a critical value (typically .05) the sign of R is taken to be significant. If above the critical value the sign of R is taken to be due to statistical fluctuations.

Based on the comparisons of the power law against the exponential and lognormal distributions, I feel inclined to say that my data follow a power-law distribution.

Would this be a correct interpretation/assumption about the test results? Or perhaps I'm missing something?

Solution

First off, while the methods might have been developed by me, Cosma Shalizi, and Mark Newman, our implementation is in MATLAB and R. The Python implementation I think you're using could be from Jeff Alstott or Javier del Molino Matamala or maybe Joel Ornstein (all of these are available off my website).

Now, about the results. A likelihood ratio test (LRT) does not allow you to conclude that you do or do not have a power-law distribution. It's only a model comparison tool, meaning it evaluates whether the power law is a less terrible fit to your data than some alternative. (I phrase it that way because an LRT is not a goodness of fit method.) Hence, even if the power-law distribution is favored over all the alternatives, it doesn't mean your data are power-law distributed. It only means that the power-law model is a less terrible statistical model of the data than the alternatives are.
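As a rough illustration of what such a comparison computes, here is a minimal sketch of a non-nested likelihood ratio test in plain Python. The function name likelihood_ratio_test and the toy exponential models are made up for illustration and are not taken from any of the packages mentioned; the function takes the pointwise log-likelihoods of two models on the same data and returns the raw (unnormalized) ratio together with the two-sided significance of its sign.

```python
import math


def likelihood_ratio_test(loglik_a, loglik_b):
    """Return (R, p) for two lists of pointwise log-likelihoods.

    R > 0 favors model A and R < 0 favors model B. p is the two-sided
    significance of the sign of R for non-nested models.
    """
    if len(loglik_a) != len(loglik_b):
        raise ValueError("both models must be evaluated on the same points")
    n = len(loglik_a)
    diffs = [a - b for a, b in zip(loglik_a, loglik_b)]
    R = sum(diffs)
    mean = R / n
    variance = sum((d - mean) ** 2 for d in diffs) / n
    if variance == 0.0:
        # Degenerate case: every point gives the same difference.
        return R, (1.0 if R == 0.0 else 0.0)
    p = math.erfc(abs(R) / math.sqrt(2.0 * n * variance))
    return R, p


# Toy example (illustrative only): two exponential models with rates 1 and 2.
toy_data = [0.5, 1.0, 1.5, 2.0]
ll_rate_1 = [math.log(1.0) - 1.0 * x for x in toy_data]
ll_rate_2 = [math.log(2.0) - 2.0 * x for x in toy_data]
R, p = likelihood_ratio_test(ll_rate_1, ll_rate_2)
print(f"R = {R:.4f}, p = {p:.4f}")
```

Note that nothing in this calculation measures how well either model fits in absolute terms: both inputs could come from poor models and the test would still return a ratio and a p-value.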

To evaluate whether the power-law distribution itself is a statistically plausible model, you should compute the p-value for the fitted power-law model, using the semi-parametric bootstrap we describe in our paper. If p>0.1, and the power-law model is favored over the alternatives by the LRT, then you can conclude relatively strong support for your data following a power-law distribution.
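The bootstrap procedure can be sketched roughly as follows. This is a simplified, standard-library-only illustration and not the reference implementation: it assumes continuous data (your word counts are discrete, which needs a different estimator and sampler), it requires at least 10 points in the tail (an arbitrary choice made here), and the names fit_power_law and bootstrap_p_value are hypothetical. The default of 100 synthetic data sets is only for speed; many more are needed for a precise p-value.

```python
import math
import random


def fit_power_law(data, min_tail=10):
    """Return (xmin, alpha, D): the xmin that minimizes the KS distance D."""
    xs = sorted(data)
    best = None
    for xmin in sorted(set(xs))[:-1]:
        tail = [x for x in xs if x >= xmin]
        n = len(tail)
        if n < min_tail:
            continue
        # Maximum-likelihood exponent for a continuous power law.
        alpha = 1.0 + n / sum(math.log(x / xmin) for x in tail)
        # KS distance between the empirical and the fitted CDF of the tail.
        D = 0.0
        for i, x in enumerate(tail):
            cdf = 1.0 - (x / xmin) ** (1.0 - alpha)
            D = max(D, abs(cdf - i / n), abs(cdf - (i + 1) / n))
        if best is None or D < best[2]:
            best = (xmin, alpha, D)
    if best is None:
        raise ValueError("not enough distinct data points to fit")
    return best


def bootstrap_p_value(data, n_boot=100, seed=0):
    """Fraction of synthetic data sets whose KS distance is >= the observed one."""
    rng = random.Random(seed)
    xmin, alpha, d_obs = fit_power_law(data)
    body = [x for x in data if x < xmin]
    n = len(data)
    p_tail = (n - len(body)) / n
    exceed = 0
    for _ in range(n_boot):
        synth = []
        for _ in range(n):
            if rng.random() < p_tail:
                # Draw from the fitted power law above xmin (inverse CDF).
                u = rng.random()
                synth.append(xmin * (1.0 - u) ** (-1.0 / (alpha - 1.0)))
            else:
                # Resample from the empirical data below xmin.
                synth.append(rng.choice(body))
        # Refit each synthetic set from scratch, including xmin.
        _, _, d_syn = fit_power_law(synth)
        if d_syn >= d_obs:
            exceed += 1
    return exceed / n_boot


# Illustrative usage on synthetic power-law data (alpha = 2.5, xmin = 1).
demo_rng = random.Random(123)
demo = [(1.0 - demo_rng.random()) ** (-1.0 / 1.5) for _ in range(150)]
print(bootstrap_p_value(demo, n_boot=50, seed=1))
```

The step that matters most is refitting xmin and alpha for every synthetic data set; reusing the original estimates would make the synthetic fits look worse than they should and bias the p-value.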

Back to your specific results: each of your LRT comparisons produces a pair (r,p), where r is the normalized log-likelihood ratio and p is the statistical significance of that ratio. The thing that is being tested for the p-value here is whether the sign of r is meaningful. If p<0.05 for an LRT, then a positive sign indicates the power-law model is favored and a negative sign indicates the alternative is favored. Looking at your results, I see that the exponential and lognormal_positive alternatives are worse fits to the data than the power-law model. However, the comparisons against the lognormal, stretched_exponential, and truncated_power_law are not significant (p>0.05), meaning these alternatives are just as terrible fits to the data as your power-law model.
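The reading of the (r,p) pairs above can be written out mechanically. The sketch below applies the p<0.05 rule to the numbers reported in the question; the helper name interpret and the label strings are made up for illustration.

```python
# (r, p) pairs copied from the question: power law versus each alternative.
results = {
    "lognormal": (0.35617607052907196, 0.5346696007),
    "exponential": (397.3832646921206, 5.3999952097178692e-06),
    "lognormal_positive": (27.82736434863289, 4.2257378698322223e-07),
    "stretched_exponential": (1.37624682020371, 0.2974292837452046),
    "truncated_power_law": (-0.0038373682383605, 0.83159372694621),
}


def interpret(r, p, threshold=0.05):
    """Classify one comparison: the sign of r only counts if p < threshold."""
    if p >= threshold:
        return "inconclusive"
    return "power law favored" if r > 0 else "alternative favored"


for name, (r, p) in results.items():
    print(f"{name:>22}: {interpret(r, p)}")
```

Only the exponential and lognormal_positive comparisons come out in favor of the power law; the other three are inconclusive.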

Without the p-value from the hypothesis test for the power-law model itself, the LRT results are not fully interpretable. But even a partial interpretation is not consistent with a strong degree of evidence for a power-law pattern, since two non-power-law models are just as good (bad) as the power law for these data. The fact that the exponential model is genuinely worse than the power law is not surprising considering how right-skewed your data are, so nothing to write home about there.