How to get the rows with the n largest distinct values in a pandas DataFrame column, including all duplicates


# Attempt: get the 3 largest values in the 'duration' column, keeping all duplicates
import pandas as pd
technologies =  "A B C D E F G H I J".split()
duration     = [88,70,50,87,77,88,77,88,1,87]
columns      = ['technologies','duration']
df = pd.DataFrame(list(zip(technologies,duration)), columns=columns)

print(df.nlargest(3, 'duration', keep='all'))

Actual output:

      technologies  duration
0            A        88
5            F        88
7            H        88

Desired output:

      technologies  duration
0            A        88
5            F        88
7            H        88
3            D        87
9            J        87
4            E        77
6            G        77

Solution

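  • You can use rank with method='dense' to select every row whose duration is one of the 3 largest distinct values. A dense rank gives tied values the same rank and leaves no gaps, so in descending order 88 gets rank 1, 87 gets rank 2 and 77 gets rank 3: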

    out = (
        df[df['duration'].rank(method='dense', ascending=False) <= 3]
        .sort_values('duration', ascending=False)
    )

    Output:

      technologies  duration
    0            A        88
    5            F        88
    7            H        88
    3            D        87
    9            J        87
    4            E        77
    6            G        77
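
  • As an alternative sketch (not part of the original answer), the same rows can be selected by first finding the 3 largest distinct values and then filtering with isin. The helper name top_n_with_ties is hypothetical and introduced only for illustration. The stable sort is an assumption made here so that tied rows keep their original relative order:

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame({
    "technologies": "A B C D E F G H I J".split(),
    "duration": [88, 70, 50, 87, 77, 88, 77, 88, 1, 87],
})


def top_n_with_ties(frame, column, n):
    """Hypothetical helper: rows whose value is among the n largest distinct values."""
    # Drop repeated values first, so nlargest counts distinct values, not rows.
    top_values = frame[column].drop_duplicates().nlargest(n)
    # Keep every row matching one of those values, largest first.
    return (
        frame[frame[column].isin(top_values)]
        .sort_values(column, ascending=False, kind="stable")
    )


out = top_n_with_ties(df, "duration", 3)
print(out)
```

    This selects the same set of rows as the rank-based solution. Which one reads better is mostly a matter of taste: rank keeps everything in one expression, while the isin version makes the "distinct values" step explicit.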