I wish to find out the largest 5 numbers in a DataFrame and store the Index name and Column name for these 5 values.
I am trying to use nlargest() and idxmax methods but failing to achieve what i want. My code is as below:
import numpy as np
import pandas as pd
from pandas import Series, DataFrame
df = DataFrame({'a': [1, 10, 8, 11, -1],'b': [1.0, 2.0, 6, 3.0, 4.0],'c': [1.0, 2.0, 6, 3.0, 4.0]})
Can you kindly let me know How can i achieve this. Thank you
max_vals = df.stack().nlargest(5)
This will give you a Series with a multiindex, where the first level is the original DataFrame's index, and the second level is the column name for the given value. Here's what max_vals
looks like:
3 a 11.0
1 a 10.0
2 a 8.0
b 6.0
c 6.0
To explicitly get the index and column names, use get_level_values
on the index of max_vals
:
max_idx = max_vals.index.get_level_values(0)
max_cols = max_vals.index.get_level_values(1)
The result of max_idx
:
Int64Index([3, 1, 2, 2, 2], dtype='int64')
The result of max_cols
:
Index(['a', 'a', 'a', 'b', 'c'], dtype='object')