python pandas pandas-datareader

How to group dates by month in python


I know I could group if I had an object with a dedicated key holding the dates. But my dates are in the index, which looks like this:

This is the index (followed by the columns):

DatetimeIndex(['2000-01-03', '2000-01-04', '2000-01-05', '2000-01-06',
               '2000-01-07', '2000-01-10', '2000-01-11', '2000-01-12',
               '2000-01-13', '2000-01-14',
               ...
               '2019-12-18', '2019-12-19', '2019-12-20', '2019-12-23',
               '2019-12-24', '2019-12-25', '2019-12-26', '2019-12-27',
               '2019-12-30', '2019-12-31'],
              dtype='datetime64[ns]', name='DATE', length=5217, freq=None)
Index(['DEXUSEU'], dtype='object')

The whole table is

        DEXUSEU
    DATE    
    2000-01-03  1.0155
    2000-01-04  1.0309
    2000-01-05  1.0335
...

Ultimately, I want to get the highest value for each month. I was experimenting with

.groupby(pd.Grouper(freq='M')).max()

But I did not manage to get the desired results.

My goal is to get the maximum value for each month. I have daily EUR/USD exchange rates from 2000 through 2019. Grouping should give me the max value for Jan 2000, the max value for Feb 2000, ..., the max value for Dec 2019.

The .groupby(usdEuro.index.month).max() call gives only 12 values; I want 12 values for each individual year.


Solution

  • Convert the index to month periods (year plus month) with to_period, group by them, use idxmax on the DEXUSEU column to find the date of each month's maximum, and select those rows with DataFrame.loc:

    df.loc[df.groupby(df.index.to_period('M'))['DEXUSEU'].idxmax()]
    

    Or, if possible, use Grouper (in pandas 2.2 and later the 'M' alias is deprecated in favor of 'ME'):

    df.loc[df.groupby(pd.Grouper(freq='M'))['DEXUSEU'].idxmax()]
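To see the to_period approach end to end, here is a minimal sketch on made-up data. The three months of random business-day rates are an assumption for illustration only and stand in for the real DEXUSEU series; the variable names by_period and monthly_max are likewise hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real data: random business-day rates
# covering January through March 2000.
idx = pd.date_range("2000-01-03", "2000-03-31", freq="B", name="DATE")
rng = np.random.default_rng(0)
df = pd.DataFrame({"DEXUSEU": 1.0 + rng.random(len(idx)) / 10}, index=idx)

# Group by month period (year + month), find the date of each month's
# maximum, then select those rows so the original dates are kept.
by_period = df.loc[df.groupby(df.index.to_period("M"))["DEXUSEU"].idxmax()]

# If only the values are needed, max() on the same grouping is enough;
# the result is indexed by month period instead of by date.
monthly_max = df.groupby(df.index.to_period("M"))["DEXUSEU"].max()

print(by_period)
print(monthly_max)
```

Both results have one entry per year-month pair, unlike grouping by index.month, which merges the same month across all years. The idxmax variant is useful when the date on which the maximum occurred matters; max() alone returns only the values.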