I have a dataframe with 3 columns: I want to make a rule that if for a same city and same id, pick the maximum value and drop the row with lower value.
eg:
City | ID | Value |
---|---|---|
London | 1 | 12.45 |
Amsterdam | 1 | 14.56 |
Paris | 1 | 16.89 |
New York | 1 | 23.86 |
Chicago | 1 | 14.56 |
Chicago | 1 | 20.76 |
Expected Output : Select highest value for same city and ID. Here Chicago has 2 entries with same ID, I want to select the row with highest value.
Expected Output
City | ID | Value |
---|---|---|
London | 1 | 12.45 |
Amsterdam | 1 | 14.56 |
Paris | 1 | 16.89 |
New York | 1 | 23.86 |
Chicago | 1 | 20.76 |
Sort the dataframe by Value in descending order and drop the duplicates:
df = df.sort_values(by=['Value'], ascending=False).drop_duplicates(['City', 'ID'], keep='first')
print(df)
Prints:
City ID Value
3 New York 1 23.86
5 Chicago 1 20.76
2 Paris 1 16.89
1 Amsterdam 1 14.56
0 London 1 12.45