I've below pandas dataframe
Name | Value1 | Value2
User 1 | 1 | 15
User 1 | 2 | 35
User 1 | 3 | 35
User 1 | 3 | 35
User 2 | 4 | 23
User 2 | 5 | 25
User 3 | 6 | 45
User 3 | 7 | 50
I want to select only those rows which have largest value in Value2 column using pandas dataframe
Like this:
Name | Value1 | Value2
User 1 | 2 | 35
User 1 | 3 | 35
User 1 | 3 | 35
User 2 | 5 | 25
User 3 | 7 | 50
Code with description will help me more as pandas are new for me.
Thanks in Advance :)
Use GroupBy.transform
with max
for Series filled by max values per groups, compare by column Value2
with Series.eq
and filter by boolean indexing
:
df1 = df[df.groupby('Name')['Value2'].transform('max').eq(df['Value2'])]
print (df1)
Name Value1 Value2
1 User 1 2 35
2 User 1 3 35
3 User 1 3 35
5 User 2 5 25
7 User 3 7 50
Details:
print (df.groupby('Name')['Value2'].transform('max'))
0 35
1 35
2 35
3 35
4 25
5 25
6 50
7 50
Name: Value2, dtype: int64
print (df.groupby('Name')['Value2'].transform('max').eq(df['Value2']))
0 False
1 True
2 True
3 True
4 False
5 True
6 False
7 True
Name: Value2, dtype: bool