Tags: python, pandas, python-xarray

Pandas dataframe: group across years


In Pandas, is there a groupby operation to group values across multiple years, when the rest of the timestamp is the same?

For example 12:00:00 01/01/2000, 12:00:00 01/01/2001 and 12:00:00 01/01/2002 would form a group, as would 15:00:00 01/01/2000, 15:00:00 01/01/2001 and 15:00:00 01/01/2002... etc.

I can sort of achieve this with:

group = timeseries.groupby([timeseries.index.minute, timeseries.index.hour, timeseries.index.day, timeseries.index.month])

but it is really ugly and not flexible to the input time format. What I really want is a way of excluding the year from the groupby, but including everything else.


Solution

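  • You can replace the year in every timestamp with a constant and then group by the index:

    # Overwrite the year so timestamps differ only in month, day and time.
    # A 29 February timestamp cannot be moved to a non-leap year such as 2010; use a leap year (e.g. 2000) if the data contain leap days.
    timeseries.index = timeseries.index.map(lambda t: t.replace(year=2010))
    print(timeseries)
    # Rows that now share the same timestamp are summed together.
    group = timeseries.groupby(level=0).sum()
    print(group)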
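
A possible variation, if the original index should stay untouched, is to group on a string key built from everything except the year. The following is only a minimal sketch: it assumes `timeseries` has a `DatetimeIndex`, and the sample values and the `key` variable are invented for illustration. The format string decides which timestamp parts take part in the grouping.

```python
import pandas as pd

# Hypothetical sample data: readings at 12:00 and 15:00 on 1 January of three years.
index = pd.to_datetime([
    "2000-01-01 12:00:00", "2000-01-01 15:00:00",
    "2001-01-01 12:00:00", "2001-01-01 15:00:00",
    "2002-01-01 12:00:00", "2002-01-01 15:00:00",
])
timeseries = pd.Series([1, 10, 2, 20, 3, 30], index=index)

# Format each timestamp without its year; equal strings fall into the same group.
key = timeseries.index.strftime("%m-%d %H:%M:%S")

# Sum the values that share month, day and time of day across years.
group = timeseries.groupby(key).sum()
print(group)
```

Because no new dates are constructed, this sketch does not depend on the chosen year being valid for every day. The trade-off is that the result is indexed by strings rather than timestamps.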