Tags: python, pandas, tuples, statsmodels, python-zip

How to iteratively add tuple elements to a dataframe as new columns?


I am using the statsmodels.stats.multitest.multipletests function to correct p-values I have stored in a dataframe:

import pandas as pd
import statsmodels.stats.multitest as multi

p_value_df = pd.DataFrame({"id": [123456, 456789], "p-value": [0.098, 0.05]})

# Correct each p-value on its own and print the resulting tuple
for _, row in p_value_df.iterrows():
    p_value = row["p-value"]
    print(p_value)
    results = multi.multipletests(
        p_value,
        alpha=0.05,
        method="bonferroni",
        maxiter=1,
        is_sorted=False,
        returnsorted=False,
    )
    print(results)

which prints one tuple per row (the output was shown as an image in the original post).

I would really like to add each of the elements of the tuple output as a new column in the p_value_df and am a bit stuck.

I've attempted to convert the results to a list and use zip(*tuples_converted_to_list), but as some of the values are floats, this throws an error.

Additionally, I'd like to pull the array elements so that array([False]) is just False.

Can anyone make any recommendations on a strategy to do this?


Solution

  • I would use a list comprehension to build a nested list of the multipletests results (one inner list per p-value), pass it to the DataFrame constructor, and finally join it with the original p_value_df:

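    import numpy as np
    import pandas as pd
    import statsmodels.stats.multitest as multi

    def fn(pval):
        # Same call as in the question, applied to a single p-value
        return multi.multipletests(
            pval, alpha=0.05, method="bonferroni",
            maxiter=1, is_sorted=False, returnsorted=False,
        )

    # One inner list per p-value; single-element arrays such as
    # array([False]) are unwrapped to plain scalars
    rows = [
        [e[0] if isinstance(e, np.ndarray) and e.size == 1 else e
         for e in fn(pval)] for pval in p_value_df["p-value"]
    ]

    # Names of the four values returned by multipletests, in order
    cols = ["reject", "pvals_corrected", "alphacSidak", "alphacBonf"]

    out = p_value_df.join(pd.DataFrame(rows, columns=cols))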
    

    Output:

    print(out)
    
           id  p-value  reject  pvals_corrected  alphacSidak  alphacBonf
    0  123456    0.098   False            0.098         0.05        0.05
    1  456789    0.050    True            0.050         0.05        0.05
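
Note that in the output above pvals_corrected equals the raw p-value: calling multipletests on one p-value at a time means each call sees only a single test, so Bonferroni has nothing to correct for. If the goal is to correct across all rows, a possible alternative is to pass the whole column in a single call and attach the four returned values as columns. This is only a sketch under that assumption; the variable names alphac_sidak and alphac_bonf are mine, not part of the statsmodels API.

```python
import pandas as pd
import statsmodels.stats.multitest as multi

p_value_df = pd.DataFrame({"id": [123456, 456789], "p-value": [0.098, 0.05]})

# One call over the whole column, so the correction accounts for both tests.
# reject and pvals_corrected are arrays with one entry per row;
# the two corrected alpha levels are scalars.
reject, pvals_corrected, alphac_sidak, alphac_bonf = multi.multipletests(
    p_value_df["p-value"], alpha=0.05, method="bonferroni"
)

# assign() broadcasts the scalars to every row and returns a new dataframe.
out = p_value_df.assign(
    reject=reject,
    pvals_corrected=pvals_corrected,
    alphacSidak=alphac_sidak,
    alphacBonf=alphac_bonf,
)
print(out)
```

With the two p-values from the question, Bonferroni doubles each of them (to 0.196 and 0.1), so neither is rejected at alpha=0.05. Because the arrays already hold one plain value per row, no unwrapping of array([False]) is needed in this version.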