Search code examples
pythonstatisticsanovamatlab-spm

Nested Anova in python with Spm1d. Can't print f statistics and p values


I'm looking for a simple solution to perform multi-factor ANOVA analysis in python. A 2-factor nested ANOVA is what I'm after, and the SPM1D python module is one way to do that, however I am having an issue.

http://www.spm1d.org/doc/Stats1D/anova.html#two-way-nested-anova

for any of the nested approach examples, there is never any F-statistic or p_values printed, nor can I find any way to print them or send them to a variable.

To go through the motions of running one of their examples, where B is nested inside A, with Y observations:

import numpy as np
from matplotlib import pyplot
import spm1d

dataset      = spm1d.data.uv1d.anova2nested.SPM1D_ANOVA2NESTED_3x3()
Y,A,B        = dataset.get_data()

#(1) Conduct ANOVA:
alpha        = 0.05
FF           = spm1d.stats.anova2nested(Y, A, B, equal_var=True)
FFi          = FF.inference(0.05)
print( FFi )

#(2) Plot results:
pyplot.close('all')
FFi.plot(plot_threshold_label=True, plot_p_values=True)
pyplot.show()

The only indication of statistical significance provided is whether the h0 hypothesis is rejected or not.

> print( FFi )

SPM{F} inference list
   design    :  ANOVA2nested
   nEffects  :  2
Effects:
   A     z=(1x101) array      df=(2, 6)    h0reject=True
   B     z=(1x101) array      df=(6, 36)   h0reject=False

In reality, that should be enough. However, in science, scientists like to think of something as more or less significant, which is actually kind of crap... significance is binary. But that's how they think about it, so I have to play along in order to get work published.

The example code produces a matplotlib plot, and this DOES have the f statistic and p_values on it!

#(2) Plot results:
pyplot.close('all')
FFi.plot(plot_threshold_label=True, plot_p_values=True)
pyplot.show()

\spm1d\examples\stats1d\ex_anova2nest.py But I can't seem to get any output which prints it.

FFi.get_p_values

and

FFi.get_f_values

produce the output:

<bound method SPMFiList.get_p_values <kabammi edit -- or get_f_values> of SPM{F} inference list
   design    :  ANOVA2nested
   nEffects  :  2
Effects:
   A     z=(1x101) array      df=(2, 6)    h0reject=True
   B     z=(1x101) array      df=(6, 36)   h0reject=False

So I don't know what to do. Clearly the FFi.plot class can access the p_values (with plot_p_values) but FFi.get_p_values cant!!? Can anyone lend a hand?

cheers, K


Solution

  • The easiest way to get the p values is to use the get_p_values method that you mention, you just need to call the method by adding () to the end.

    p = FFi.get_p_values()
    print(p)
    

    This yields:

    ([0.016584151119287904], [])
    

    To see more detailed information for each effect in 2+-way ANOVA, including p values, use print along with the individual F statistics like this:

    print( FFi[0] )
    print( FFi[1] )
    

    The first print statement will produce output like this:

    SPM{F} inference field
       SPM.effect    :   Main A
       SPM.z         :  (1x101) raw test stat field
       SPM.df        :  (2, 6)
       SPM.fwhm      :  11.79254
       SPM.resels    :  (1, 8.47993)
    Inference:
       SPM.alpha     :  0.050
       SPM.zstar     :  24.30619
       SPM.h0reject  :  True
       SPM.p_set     :  0.017
       SPM.p_cluster :  (0.017)
    

    You can retrieve the clusters' p values like this:

    p = [F.p  for  F in FFi]
    

    which gives the same result as calling get_p_values.

    Note that there are no p values in this case for FFi[1] because the test statistic fails to cross the alpha-defined threshold (see the "Main B" panel in the figure above). If you need to report p values in this case as well, one option is simply to use "p > alpha". More precise p value are available parametrically up until about p = 0.5, but larger p values than that are not very accurate using parametric methods, so if you need p values for all cases consider using the nonparametric version: spm1d.stats.nonparam.anova2nested.