Pick the highest group/category for each person - python


I have a dataframe with three columns: Name, group1 and group2. The 'Name' column lists the different people/cases, and the two 'group' columns show the categories these people belong to. The code further down recreates this data set.

In this data set, the same person can be assigned to multiple groups, and I need to pick the highest group they belong to. '01_high' is the highest group and '03_low' is the lowest.

As an example, let's take the first case, 'Tom': in group1 he belongs to '01_high' and in group2 he belongs to '03_low'. I need to create a third column, 'group3', holding the higher of the two categories. In this case the value in the group3 column for 'Tom' will be '01_high'.

Code to create the data set:

import pandas as pd

data = {'Name': ['Tom', 'Nick', 'Jack', 'Ann'],
        'group1': ['01_high', '02_medium', '03_low', '02_medium'],
        'group2': ['03_low', '03_low', '02_medium', '03_low']}

df = pd.DataFrame(data)
df

Final desired output: the same dataframe with an added 'group3' column that holds, for each person, the higher of their two categories.

I'm fairly new to Python and not sure how to achieve the desired output, so any help is greatly appreciated. Thanks.


Solution

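  • Here is one option. Because every label starts with a numeric prefix (01, 02, 03), sorting the labels in each row alphabetically puts the highest group first, so taking the first element gives the result: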

    df["group3"] = df.filter(like="group").apply(sorted, axis=1).str[0]
    

    Output:

    print(df)
    
       Name     group1     group2     group3
    0   Tom    01_high     03_low    01_high
    1  Nick  02_medium     03_low  02_medium
    2  Jack     03_low  02_medium  02_medium
    3   Ann  02_medium     03_low  02_medium
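
The one-liner above relies on the labels sorting alphabetically in the right order. If the labels did not carry a numeric prefix, one alternative would be to spell the ranking out explicitly. The sketch below is only an illustration under that assumption: the `rank` dictionary and the `group_cols` list are hypothetical names that are not part of the original question or answer.

```python
import pandas as pd

data = {'Name': ['Tom', 'Nick', 'Jack', 'Ann'],
        'group1': ['01_high', '02_medium', '03_low', '02_medium'],
        'group2': ['03_low', '03_low', '02_medium', '03_low']}
df = pd.DataFrame(data)

# Hypothetical explicit ranking: a lower number means a higher group.
rank = {'01_high': 0, '02_medium': 1, '03_low': 2}

# Columns to compare, listed explicitly so that 'group3' is never
# picked up by accident if this code is run a second time.
group_cols = ['group1', 'group2']

# For each row, keep the label whose rank is smallest (the highest group).
df['group3'] = df[group_cols].apply(lambda row: min(row, key=rank.get), axis=1)

print(df)
```

For this data set it should give the same group3 column as the sorted-based option, and the ordering no longer depends on how the labels happen to be spelled.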