
How to compare two columns and pick the greater class using a class hierarchy list


I have a list of classes ordered from greatest to lowest:

classes = ['A','B','C','D']

And a DataFrame with two columns:

 Segmentation 2019   Segmentation 2020
         B                   A
         B                   A
         A                   B
         C                   C
         B                   D

How can I create a third column that holds whichever of the two classes is greater according to the list (and, if they are equal, keeps that class)?


Solution

  • You can build a dictionary from the classes list in which each key is a class and each value is its index. Because the list is ordered from greatest to lowest, the index works as a rank.

    Then create two rank columns holding those ranks (0 to N-1, with 0 being the greatest). Finally, compare the ranks and take the class with the better rank (i.e. the smaller value). In the code below, rows where both classes match are labelled "equal".

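    import pandas as pd

    df = pd.DataFrame({'Seg 2019': list('BBACB'), 'Seg 2020': list('AABCD')})
    classes = ['A', 'B', 'C', 'D']  # ordered from greatest to lowest
    # Map each class to its list position: a lower index means a greater class
    classes_dict = {val: index for index, val in enumerate(classes)}
    df['Seg 2019 Rank'] = df['Seg 2019'].map(classes_dict)
    df['Seg 2020 Rank'] = df['Seg 2020'].map(classes_dict)
    # Pick the class with the smaller rank value; label ties as "equal"
    df['greater'] = df.apply(lambda x: x['Seg 2019'] if x['Seg 2019 Rank'] < x['Seg 2020 Rank'] else x['Seg 2020'] if x['Seg 2020 Rank'] < x['Seg 2019 Rank'] else "equal", axis=1)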
    

    Output:

    Seg 2019    Seg 2020    Seg 2019 Rank   Seg 2020 Rank   greater
        B   A   1   0   A
        B   A   1   0   A
        A   B   0   1   A
        C   C   2   2   equal
        B   D   1   3   B
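
The question asks to keep the class itself when both columns match, rather than writing "equal". A minimal vectorized sketch of that variant follows. It assumes the same short column names as the answer ('Seg 2019', 'Seg 2020'); the names `rank`, `rank_2019` and `rank_2020` are introduced here for illustration only. It uses `Series.where` instead of a row-wise `apply`:

```python
import pandas as pd

# Sample data mirroring the question (column names follow the answer's short names)
df = pd.DataFrame({
    "Seg 2019": ["B", "B", "A", "C", "B"],
    "Seg 2020": ["A", "A", "B", "C", "D"],
})

classes = ["A", "B", "C", "D"]  # ordered from greatest to lowest
rank = {cls: i for i, cls in enumerate(classes)}  # lower index = greater class

rank_2019 = df["Seg 2019"].map(rank)
rank_2020 = df["Seg 2020"].map(rank)

# Keep the 2019 class where its rank is at least as good (<=) as the 2020 rank,
# otherwise take the 2020 class. On a tie both values are the same class,
# so that class is kept.
df["greater"] = df["Seg 2019"].where(rank_2019 <= rank_2020, df["Seg 2020"])

print(df)
```

For the sample data this should give A, A, A, C, B in the `greater` column. A class missing from `classes` maps to NaN, so the comparison is False and the 2020 value is taken; validate the input first if that matters.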
    

    If you later need a new class (e.g. VIP), just add it to the list before A and it will be treated as the greater class.
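
To illustrate extending the hierarchy, here is a small sketch that wraps the rank comparison in a helper and puts 'VIP' at the front of the list. The function name `greater_class` and the sample rows are hypothetical, made up for this example. Unlike the answer's code, this version keeps the shared class on a tie rather than writing "equal".

```python
import pandas as pd

def greater_class(df, col_a, col_b, classes):
    """Return, per row, whichever of the two columns holds the greater class.

    `classes` is ordered from greatest to lowest; ties keep the shared value.
    """
    rank = {cls: i for i, cls in enumerate(classes)}
    # Keep col_a where its rank value is <= col_b's rank value, else use col_b
    return df[col_a].where(df[col_a].map(rank) <= df[col_b].map(rank), df[col_b])

# Hypothetical rows that include the new class
df = pd.DataFrame({
    "Seg 2019": ["B", "VIP", "A"],
    "Seg 2020": ["VIP", "A", "B"],
})

# Adding 'VIP' before 'A' is the only change needed
classes = ["VIP", "A", "B", "C", "D"]
df["greater"] = greater_class(df, "Seg 2019", "Seg 2020", classes)

print(df)
```

Because the ranks are derived from list positions on each call, reordering or extending `classes` needs no other code changes.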