python pandas statsmodels

Statistic values of Fleiss Kappa using statsmodels.stats.inter_rater


I use statsmodels.stats.inter_rater.fleiss_kappa to calculate my inter-rater reliability, but it only returns the kappa value. How can I also get the z-value, the p-value, and a confidence interval?


Solution

  • My advice is to reopen your stats lecture notes and look at the formulas. This is a standard exercise that I often give my students:

    import numpy as np
    import pandas as pd
    from statsmodels.stats.inter_rater import fleiss_kappa
    from scipy.stats import norm
    
    np.random.seed(42)
    
    n_items, n_raters, n_categories = 15, 30, 3
    
    # Simulated ratings: one column per item, one row per rater
    data = {
        f'Item{i+1}': np.random.choice([0, 1, 2], size=n_raters, p=[0.33, 0.33, 0.34])
        for i in range(n_items)
    }
    df = pd.DataFrame(data)
    
    # Count table expected by fleiss_kappa: one row per item, one column per category
    formatted_data = {
        f"Category {cat}": [(df[item] == cat).sum() for item in df]
        for cat in range(n_categories)
    }
    formatted_df = pd.DataFrame(formatted_data)
    
    kappa = fleiss_kappa(formatted_df.values)
    
    # Overall proportion of ratings per category (sum over items, hence axis=0)
    category_totals = formatted_df.sum(axis=0)
    p = (category_totals / (n_items * n_raters)).values
    q = 1 - p
    
    # Variance of kappa under the null hypothesis of chance agreement
    # (Fleiss, Nee & Landis, 1979)
    pq = np.sum(p * q)
    variance = (2 * (pq**2 - np.sum(p * q * (q - p)))
                / (pq**2 * n_items * n_raters * (n_raters - 1)))
    
    if variance > 0:
        z_value = kappa / np.sqrt(variance)
        p_value = 2 * norm.sf(np.abs(z_value))  # two-sided p-value
        z_critical = norm.ppf(0.975)
        # Approximate interval: it reuses the null-hypothesis standard error
        margin_of_error = z_critical * np.sqrt(variance)
        lower_bound = kappa - margin_of_error
        upper_bound = kappa + margin_of_error
    
        print("Fleiss' kappa:", kappa)
        print("Z-value:", z_value)
        print("P-value:", p_value)
        print("Confidence interval (95%):", (lower_bound, upper_bound))
    else:
        print("Variance calculation error: Non-positive variance", variance)

    With this seed, the first line printed is

    Fleiss' kappa: -0.008536683290635389

    followed by the z-value, p-value, and 95% confidence interval derived from the variance above.
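
The confidence interval above relies on a normal approximation. As a cross-check, one alternative is a percentile bootstrap that resamples items (rows of the count table). The sketch below is only an illustration under assumptions, not part of the original answer: the helper name `bootstrap_kappa_ci`, the 2000 resamples, and the simulated ratings are hypothetical choices, while `aggregate_raters` and `fleiss_kappa` are the statsmodels functions.

```python
import numpy as np
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa


def bootstrap_kappa_ci(table, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for Fleiss' kappa (hypothetical helper).

    table: counts of shape (n_items, n_categories); every row sums to the
    same number of raters. Items (rows) are resampled with replacement.
    """
    table = np.asarray(table)
    rng = np.random.default_rng(seed)
    n_items = table.shape[0]
    estimates = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n_items, size=n_items)
        estimates[b] = fleiss_kappa(table[idx])
    # Drop degenerate resamples in which kappa is undefined (NaN or inf)
    estimates = estimates[np.isfinite(estimates)]
    lower, upper = np.percentile(
        estimates, [100 * alpha / 2, 100 * (1 - alpha / 2)]
    )
    return lower, upper


# Simulated ratings (assumption): 15 items, 30 raters, 3 categories
rng = np.random.default_rng(42)
ratings = rng.integers(0, 3, size=(15, 30))    # shape (items, raters)
table, categories = aggregate_raters(ratings)  # counts per item and category

kappa = fleiss_kappa(table)
lower, upper = bootstrap_kappa_ci(table)
print("Fleiss' kappa:", kappa)
print("Bootstrap 95% interval:", (lower, upper))
```

`aggregate_raters` expects one row per item and one column per rater, so a frame laid out like `df` above (raters in rows) would need to be transposed first. With only a handful of items the bootstrap interval can be unstable, so it is best read alongside the analytic one rather than as a replacement.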