python matlab scipy statistical-test

What is Python's equivalent of MATLAB's ranksum?


scipy.stats.ranksums, wilcoxon, and mannwhitneyu all give results that differ from MATLAB's ranksum.


Solution

  • It depends on which options you are using. Consider the following example:

    1- MATLAB

    rng('default') % for reproducibility
    x = unifrnd(0,1,20,1);
    y = unifrnd(0.25,1.25,20,1);
    p = ranksum(x,y)
    p =
       0.2503
    

    2- Python (for consistency, I did not regenerate the numbers; I copied the values generated in MATLAB into the Python script)

    from scipy.stats import ranksums, mannwhitneyu
    import numpy as np
    
    y=np.array([0.905740699156587, 0.285711678574190, 1.09912930586878, 1.18399324775755, 0.928735154857774, 1.00774013057833, 0.993132468124916, 0.642227019534168, 0.905477890177557, 0.421186687811562, 0.956046088019609, 0.28183284637742, 0.526922984960890, 0.296171390631154, 0.347131781235848, 1.07345782832729, 0.944828622975817, 0.567099480060861, 1.20022204883835, 0.284446080502909])
    x=np.array([0.814723686393179,  0.905791937075619,  0.126986816293506,  0.913375856139019,  0.632359246225410,  0.0975404049994095, 0.278498218867048,  0.546881519204984,  0.957506835434298,  0.964888535199277,  0.157613081677548,  0.970592781760616,  0.957166948242946,  0.485375648722841,  0.800280468888800,  0.141886338627215,  0.421761282626275,  0.915735525189067,  0.792207329559554,  0.959492426392903])
    
    p = ranksums(x, y)
    print(p)
    
    RanksumsResult(statistic=-1.1631538287209875, pvalue=0.24476709560795806)
    

    These results come from each function's default behavior, described in the respective documentation as follows:

    1- for MATLAB:

    p = ranksum(x,y) returns the p-value of a two-sided Wilcoxon rank sum test. ranksum tests the null hypothesis that data in x and y are samples from continuous distributions with equal medians, against the alternative that they are not. The test assumes that the two samples are independent. x and y can have different lengths. This test is equivalent to a Mann-Whitney U-test.

    2- for Python:

    Compute the Wilcoxon rank-sum statistic for two samples. The Wilcoxon rank-sum test tests the null hypothesis that two sets of measurements are drawn from the same distribution. The alternative hypothesis is that values in one sample are more likely to be larger than the values in the other sample. This test should be used to compare two samples from continuous distributions. It does not handle ties between measurements in x and y.
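    The gap between 0.2503 and 0.2448 appears to come from a continuity correction: for samples of this size MATLAB's ranksum falls back to a normal approximation that includes the correction, while scipy.stats.ranksums does not apply one. The following is a minimal sketch of that arithmetic, assuming no ties and taking U = 157 (the Mann-Whitney statistic SciPy reports for this data) as given; the variable names are illustrative only.

```python
import math
from scipy.stats import norm

n1, n2 = 20, 20   # sizes of x and y
u = 157.0         # Mann-Whitney U for x on this data (taken as given)

# Mean and standard deviation of U under the null hypothesis (no ties assumed)
mean_u = n1 * n2 / 2
sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)

# Without continuity correction: what scipy.stats.ranksums computes
z_plain = (u - mean_u) / sd_u
p_plain = 2 * norm.sf(abs(z_plain))

# With a 0.5 continuity correction: matches MATLAB's default for this sample size
z_cc = (abs(u - mean_u) - 0.5) / sd_u
p_cc = 2 * norm.sf(z_cc)

print(round(p_plain, 4))  # 0.2448
print(round(p_cc, 4))     # 0.2503
```

    So the two functions agree on the ranks and differ only in how the normal approximation is applied.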


    Another Example

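    Here I use the same data and the same MATLAB function, but with different options (a left-tailed, exact test). The result is now close to the one from SciPy's mannwhitneyu, though not identical (0.1267 vs. 0.1251), because MATLAB computes an exact p-value here while the mannwhitneyu call below uses a normal approximation.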

    MATLAB

    [p,h,stats] = ranksum(x,y,'alpha',0.01,'tail','left','method','exact');
    p = 
        0.1267
    

    Python

    # alternative='less' corresponds to MATLAB's 'tail','left'
    m = mannwhitneyu(x, y, use_continuity=True, alternative='less')
    print(m)
    
    MannwhitneyuResult(statistic=157.0, pvalue=0.12514839875175593)
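    To tie both examples together, here is a hedged, self-contained sketch that tries to reproduce the MATLAB numbers with mannwhitneyu alone. It assumes SciPy 1.7 or later, where the `method` argument ('asymptotic' or 'exact') is available; on older versions only the normal approximation exists. The result names (`two_sided`, `left_asym`, `left_exact`) are mine, not from the original answer.

```python
import numpy as np
from scipy.stats import mannwhitneyu

# Values copied from the MATLAB session above
x = np.array([
    0.814723686393179, 0.905791937075619, 0.126986816293506, 0.913375856139019,
    0.632359246225410, 0.0975404049994095, 0.278498218867048, 0.546881519204984,
    0.957506835434298, 0.964888535199277, 0.157613081677548, 0.970592781760616,
    0.957166948242946, 0.485375648722841, 0.800280468888800, 0.141886338627215,
    0.421761282626275, 0.915735525189067, 0.792207329559554, 0.959492426392903,
])
y = np.array([
    0.905740699156587, 0.285711678574190, 1.09912930586878, 1.18399324775755,
    0.928735154857774, 1.00774013057833, 0.993132468124916, 0.642227019534168,
    0.905477890177557, 0.421186687811562, 0.956046088019609, 0.28183284637742,
    0.526922984960890, 0.296171390631154, 0.347131781235848, 1.07345782832729,
    0.944828622975817, 0.567099480060861, 1.20022204883835, 0.284446080502909,
])

# Counterpart of MATLAB's default ranksum(x,y) for this sample size:
# two-sided, normal approximation with continuity correction
two_sided = mannwhitneyu(x, y, use_continuity=True,
                         alternative='two-sided', method='asymptotic')

# Left-tailed test, normal approximation (the value printed in the answer)
left_asym = mannwhitneyu(x, y, use_continuity=True,
                         alternative='less', method='asymptotic')

# Counterpart of ranksum(x,y,'tail','left','method','exact')
left_exact = mannwhitneyu(x, y, alternative='less', method='exact')

print(two_sided.pvalue)   # compare with MATLAB's 0.2503
print(left_asym.pvalue)   # compare with 0.1251 above
print(left_exact.pvalue)  # compare with MATLAB's exact 0.1267
```

    Passing `alternative` and `method` explicitly avoids depending on defaults, which have changed between SciPy releases.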