Search code examples
pythonmatlabfor-loopanovat-test

one way anova for multiple classes either matlab or python


i have 4 classes of size (72,22,22) and i want to do anova1 for each pair of the (22,22) in those classes . for example; i wanna grab, let's say, pair (2,3) in each class and do the anova for each pair in the 4 arrays, so the output would be an array of (22,22)representing P-values of each pair across 4 classes. hopefullay my code explains what i am trying to say :) i have tried this piece of code but im not sure cuz the results are kinda odd; i guess the issue is in 2 for loops. here is the code here is my code :



x = load("plv_8_12.mat");

x = x.plv;

size(x)

ans =
        2592          22          22

S1C1 =x(1:72,:,:);

S1C2 = x(649:720,:,:);

S1C3 = x(1297:1368,:,:);

S1C4 = x(1945:2016,:,:);

p_all = [];


for x =1:22

    for y=1:22
        
        tc1 = S1C1(:,x,y);
        tc2 = S1C2(:,x,y);
        tc3 = S1C3(:,x,y);
        tc4 = S1C4(:,x,y);
        temp = [tc1 tc2 tc3 tc4];
        
        %p_all(x,y) = anova1(temp);
        [p,tbl,stats]=anova1(temp);
        
        close all
    end
end

Solution

  • Since the observations are structured in an array, there is a strong suspicion that they are related between each element (z, x, y) in each "class", i.e. paired. If it is the case, the regular ANOVA is a non-valid test, and one may need the repeated measures ANOVA (much slower in the code below). For the regular ANOVA, all observations need to be independent from the other group. I don't have any mat files, so I can't verify the first part of the code. If it doesn't work, please let me know.

    import numpy as np
    import pandas as pd
    import scipy.io as sio
    from scipy.stats import f_oneway
    from statsmodels.stats.anova import AnovaRM
    
    mat_contents = sio.loadmat("plv_8_12.mat")
    data = sorted(mat_contents.keys())[-1]
    data = mat_contents[data]
    print(data.shape)
    
    #data = np.random.normal(loc=0, scale=1, size=(2016, 22, 22))
    
    S1C1 = data[0:72,:,:]
    S1C2 = data[648:720,:,:]
    S1C3 = data[1296:1368,:,:]
    S1C4 = data[1944:2016,:,:]
    
    p_all_paired = np.zeros((22,22))
    p_all_indipendent = np.zeros((22,22))
    
    for x in range(data.shape[1]):
        for y in range(data.shape[1]):
            df = pd.DataFrame({'ID': np.arange(72), 'S1C1': S1C1[:,x,y], 'S1C2': S1C2[:,x,y], 'S1C3': S1C3[:,x,y], 'S1C4': S1C4[:,x,y]}).melt(id_vars='ID', var_name='Class', value_name='Measurement')
            p_all_indipendent[x,y] = f_oneway(S1C1[:,x,y], S1C2[:,x,y], S1C3[:,x,y], S1C4[:,x,y])[1]
            p_all_paired[x,y] = AnovaRM(df, 'Measurement', 'ID', ['Class']).fit().anova_table.loc['Class','Pr > F']
            
    print("Unpaired p-values")
    print(p_all_indipendent)
    print("\n\nPaired p-values")
    print(p_all_paired)