Tags: python, pandas, resampling, argmax

Pandas: Resample dataframe column, get discrete feature that corresponds to max value


Sample data:

import pandas as pd
import numpy as np
import datetime

data = {'value': [1, 2, 4, 3], 'names': ['joe', 'bob', 'joe', 'bob']}
start, end = datetime.datetime(2015, 1, 1), datetime.datetime(2015, 1, 4)
# pd.date_range builds the daily index; recent pandas versions no longer
# accept pd.DatetimeIndex(start=..., end=..., freq=...)
df = pd.DataFrame(data=data, index=pd.date_range(start=start, end=end, freq="D"),
                  columns=["value", "names"])

gives:

            value names
2015-01-01      1   joe
2015-01-02      2   bob
2015-01-03      4   joe
2015-01-04      3   bob

I want to resample by '2D' and, for each 2-day bin, get the max value together with the name from that same row, something like:

df.resample('2D')

The expected result should be:

            value names
2015-01-01      2   bob
2015-01-03      4   joe

Can anyone help me?


Solution

  • You can resample and call idxmax on value to get, for each bin, the index label of the row holding the maximum, then use those labels to look up names and value in the original frame

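    # idxmax returns, per 2-day bin, the timestamp of the row with the largest value;
    # assign then replaces those timestamps with the matching names and values
    (df.resample('2D')[['value']].idxmax()
       .assign(names=lambda x: df.loc[x['value'], 'names'].values,
               value=lambda x: df.loc[x['value'], 'value'].values)
    )

    Out[116]: 
                value names
    2015-01-01      2   bob
    2015-01-03      4   joe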
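
    For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same idxmax idea, written as two separate steps instead of one chained expression. The names idx, result and naive are only illustrative, and it assumes every 2-day bin has at least one non-missing value (an empty bin would give idxmax nothing to point at). It also shows why a plain .max() is not enough: max is taken per column, so names would not come from the row that holds the largest value.

```python
import pandas as pd

# Same sample data as in the question.
df = pd.DataFrame(
    {"value": [1, 2, 4, 3], "names": ["joe", "bob", "joe", "bob"]},
    index=pd.date_range("2015-01-01", "2015-01-04", freq="D"),
)

# Column-wise max per bin: 'names' becomes the alphabetically largest
# string in the bin, which is unrelated to the row with the max value.
naive = df.resample("2D").max()

# For each 2-day bin, the index label of the row with the largest value.
idx = df.resample("2D")["value"].idxmax()

# Pull those full rows from the original frame (this returns a copy),
# then relabel them with the bin start dates.
result = df.loc[idx.to_numpy()]
result.index = idx.index

print(result)
```

    Selecting whole rows with df.loc means any extra columns come along automatically, without adding one lambda per column.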