Search code examples
pandasnumpydataframecombinationspython-itertools

Same calculation for all combinations in DataFrame


I got dataframe like below,

import pandas as pd
df = pd.DataFrame({'CITY': ['A','B','C','A','C','B'], 
                   'MAKE_NAME': ['SO','OK','CO','LU','CO','OK'],
                   'USER' : ['JK','JK','MK','JK','JK','JK'],
                   'RESULT_CODE' : ['Y','Y','N','N','Y','Y'],
                   'VALID' : [1,1,1,1,1,0],
                   'COUNT' : [1,1,1,1,1,1] })

I want to calculate the valid/count of all combinations in double and triple and quadruple. Also i want to get result as dataframe.

For example result for double like below,

enter image description here

Also result for triple like below,

enter image description here

Thanks for all,


Solution

  • You can find the solution below.

    import pandas as pd
    df = pd.DataFrame({'CITY': ['A','B','C','A','C','B'], 
                       'MAKE_NAME': ['SO','OK','CO','LU','CO','OK'],
                       'USER' : ['JK','JK','MK','JK','JK','JK'],
                       'RESULT_CODE' : ['Y','Y','N','N','Y','Y'],
                       'VALID' : [1,1,1,1,1,0],
                       'COUNT' : [1,1,1,1,1,1] })
    for i in df.columns:
        for j in df.columns:
            for k in df.columns:
                for l in df.columns:
    
    
                    try:
                        a1 = df.groupby([i,j,k,l], as_index=False, sort=True, group_keys=True)[['VALID','COUNT']].count()
                        a1['RATE'] = a1.VALID / a1.COUNT
                        print(a1)          
                    except Exception:
                        pass