I have this code:
import pandas as pd
df = pd.DataFrame({'consumption': [10.51, 103.11, 55.48], 'co2_emissions': [37.2, 19.66, 1712]}, index=['Pork', 'Wheat Products', 'Beef'])
df['Max'] = df.idxmax(axis=1, skipna=True, numeric_only=True)
df
I need to find the n largest values in each row. I found a technique that uses apply with a lambda, but it raises an error:
df.apply(lambda s: s.abs().nlargest(2).index.tolist(), axis=1,skipna=True, numeric_only=True)
TypeError: <lambda>() got an unexpected keyword argument 'numeric_only'
Is there a way to get the top N results using idxmax? And how can I overcome the error raised by the apply/lambda method?
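Your error is due to passing the skipna and numeric_only parameters to apply. They are not options of apply itself: any extra keyword arguments are forwarded to the function being applied, and your lambda only accepts the row.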
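To illustrate the forwarding behaviour, here is a minimal sketch. The helper name top_labels and its parameter n are made up for this example; the frame is the one from the question without the extra 'Max' column.

```python
import pandas as pd

df = pd.DataFrame(
    {"consumption": [10.51, 103.11, 55.48], "co2_emissions": [37.2, 19.66, 1712]},
    index=["Pork", "Wheat Products", "Beef"],
)

# Hypothetical helper: it declares the extra keyword that apply will forward.
def top_labels(row, n=1):
    return row.nlargest(n).index.tolist()

# n=2 is not an option of apply; it is passed through to top_labels.
forwarded = df.apply(top_labels, axis=1, n=2)

# A lambda that only takes the row has no parameter to receive numeric_only,
# so the forwarded keyword triggers a TypeError.
try:
    df.apply(lambda s: s.nlargest(2).index.tolist(), axis=1, numeric_only=True)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
```

So the fix is either to drop those keywords or to handle the filtering yourself before calling apply.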
You can fix it with:
(df.select_dtypes('number')
   .apply(lambda s: s.dropna().abs().nlargest(2)
                     .index.tolist(), axis=1)
)
Output:
Pork [co2_emissions, consumption]
Wheat Products [consumption, co2_emissions]
Beef [co2_emissions, consumption]
dtype: object
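A more efficient approach using numpy (unlike the version above, it does not take absolute values or drop NaN, and the labels are only guaranteed to come out in descending order for N = 2):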
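import numpy as np

N = 2
tmp = df.select_dtypes('number')
out = pd.Series(
    np.take_along_axis(
        # column labels as a (n_columns, 1) array
        tmp.columns.to_numpy()[:, None],
        # positions of the N largest values per row (pass a plain array)
        np.argpartition(tmp.to_numpy(), -N)[:, -N:],
        axis=0
    )[:, ::-1].tolist(),
    index=df.index,
)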
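If you need the labels sorted from largest to smallest for any N, one option is to sort only the N candidates that argpartition selects. The following is a sketch rather than a definitive implementation: the function name top_n_columns is made up here, and it assumes the numeric columns contain no NaN and that N does not exceed the number of numeric columns.

```python
import numpy as np
import pandas as pd

def top_n_columns(frame, n):
    """Per row, return the labels of the n largest numeric values, largest first.

    Hypothetical helper; assumes no NaN in the numeric columns.
    """
    num = frame.select_dtypes("number")
    values = num.to_numpy()
    # Positions of the n largest values in each row, in no particular order.
    part = np.argpartition(values, -n, axis=1)[:, -n:]
    # Order those n candidates by value, descending.
    order = np.argsort(-np.take_along_axis(values, part, axis=1), axis=1)
    idx = np.take_along_axis(part, order, axis=1)
    # Map the column positions back to column labels.
    return pd.Series(num.columns.to_numpy()[idx].tolist(), index=frame.index)

df = pd.DataFrame(
    {"consumption": [10.51, 103.11, 55.48], "co2_emissions": [37.2, 19.66, 1712]},
    index=["Pork", "Wheat Products", "Beef"],
)
out = top_n_columns(df, 2)

# A three-column frame, where plain argpartition would not guarantee the order.
wide = pd.DataFrame({"a": [1, 9, 4], "b": [3, 7, 6], "c": [2, 8, 5]})
out_wide = top_n_columns(wide, 2)
```

The extra argsort only touches N values per row, so the cost stays close to that of the argpartition version while making the order deterministic.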