Search code examples
pythonpandasdataframepandas-groupby

I am stuck in writing the python code for below problem


I have below data frame.

df = pd.DataFrame({'vin':['aaa','bbb','bbb','bbb','ccc','ccc','ddd','eee','eee','fff'],'module':['NORMAL','1ST_PRIORITY','2ND_PRIORITY','HELLO','3RD_PRIORITY','2ND_PRIORITY','2ND_PRIORITY','3RD_PRIORITY','HELLO','ABS']})

I wanted to find if the vin column contains a unique value then in the Result column it should return 'YES' and if the vin column is not unique then it will check the 'module' column and return 'YES' where the module column has more priority value.

I want output like the below data frame.

df = pd.DataFrame({'vin':['aaa','bbb','bbb','bbb','ccc','ccc','ddd','eee','eee','fff'],'module':['NORMAL','1ST_PRIORITY','2ND_PRIORITY','HELLO','3RD_PRIORITY','2ND_PRIORITY','2ND_PRIORITY','3RD_PRIORITY','HELLO','ABS'],
               'Result':['YES','YES','NO','NO','NO','YES','YES','YES','NO','YES']})

Below code, I have tried and it gives the correct result but it involves too many steps.

df['count'] = df.groupby('vin').vin.transform('count')


def Check1(df):
    if (df["count"]  == 1):
        return 1

elif ((df["count"]  != 1) & (df["module"]  == '1ST_PRIORITY')):
    return 1

elif ((df["count"]  != 1) & (df["module"]  == '2ND_PRIORITY')):
    return 2

elif ((df["count"]  != 1) & (df["module"]  == '3RD_PRIORITY')):
    return 3
    
else:
    return 4

df['Sort'] = df.apply(Check1, axis=1)

df = df.sort_values(by=['vin', 'Sort'])

df.drop_duplicates(subset=['vin'], keep='first',inplace = True)

df

Solution

  • Here's the trick, you need a custom order:

    from pandas.api.types import CategoricalDtype
    
    #create your custom order
    custom_order = CategoricalDtype(
        ['Delhi','Agra','Paris','ABS','HELLO','NORMAL'], 
        ordered=True)
    
    #then attribute it to the desired column
    df['module'] = df['module'].astype(custom_order)
    
    
    df['Result'] = ((~df.sort_values('module', ascending=True).duplicated('vin'))
                        .replace({True: 'YES', False: 'NO'}))
    

    Result:

    index vin module Result
    0 aaa NORMAL YES
    1 bbb Delhi YES
    2 bbb Agra NO
    3 bbb HELLO NO
    4 ccc Paris NO
    5 ccc Agra YES
    6 ddd Agra YES
    7 eee Paris YES
    8 eee HELLO NO
    9 fff ABS YES