python pandas datetime group-by intervals

Groupby one year interval with the start as first datapoint of the series

How can I group a time series with 1 year intervals such that the start of the first interval is the first datapoint and the new series is labeled by that starting point?

For example, the series below starts at 2000-01-11, so the first interval should contain all datapoints from 2000-01-11 through 2001-01-10, the second those from 2001-01-11 through 2002-01-10, and so on. The new series should be labeled 2000-01-11, 2001-01-11, etc.

import pandas as pd
import numpy as np

i = pd.date_range('2000-01-11', '2022-02-10', freq='D')
t = pd.Series(index=i, data=np.random.randint(0,100,len(i)))
print(t)

t.groupby(pd.Grouper(freq='1Y', origin='start', label='left')).mean()

This code seems to bin from the start of each calendar year and label each bin by the end of the year.

Solution

If I understand correctly, you can build the interval edges yourself, bin the index with pd.cut, and group by the resulting categories:

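# Year-end edges (1999-12-31, 2000-12-31, ...) shifted by 11 days
# (DateOffset(11) means 11 days), so every edge lands on January 11.
x = pd.cut(
    i,
    pd.date_range(start="1999-12-31", end="2022-02-10", freq="12M")
    + pd.offsets.DateOffset(11),
    right=False,  # left-closed bins such as [2000-01-11, 2001-01-11)
    include_lowest=True
)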

out = t.groupby(x).mean()
print(out)

Prints:

[2000-01-11, 2001-01-11)    51.174863
[2001-01-11, 2002-01-11)    48.197260
[2002-01-11, 2003-01-11)    49.400000
[2003-01-11, 2004-01-11)    50.509589
[2004-01-11, 2005-01-11)    49.680328
[2005-01-11, 2006-01-11)    48.334247
[2006-01-11, 2007-01-11)    47.882192
[2007-01-11, 2008-01-11)    51.405479
[2008-01-11, 2009-01-11)    50.437158
[2009-01-11, 2010-01-11)    49.520548
[2010-01-11, 2011-01-11)    48.591781
[2011-01-11, 2012-01-11)    51.643836
[2012-01-11, 2013-01-11)    51.084699
[2013-01-11, 2014-01-11)    50.334247
[2014-01-11, 2015-01-11)    51.109589
[2015-01-11, 2016-01-11)    48.230137
[2016-01-11, 2017-01-11)    49.691257
[2017-01-11, 2018-01-11)    47.326027
[2018-01-11, 2019-01-11)    48.728767
[2019-01-11, 2020-01-11)    47.947945
[2020-01-11, 2021-01-11)    48.866120
[2021-01-11, 2022-01-11)    49.268493
dtype: float64
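
Note that the output stops at the bin ending 2022-01-11: the last edge generated is 2022-01-11, so the datapoints from 2022-01-11 through 2022-02-10 fall outside every bin and are dropped by the groupby. The edges are also hard-coded for this particular series. Below is a minimal sketch of a variant that derives the edges from the first datapoint and labels each group by its start date, as the question asks. It assumes the index is sorted; the names `start`, `edges` and `labels` are illustrative, and the fixed seed is only there to make the run reproducible.

```python
import numpy as np
import pandas as pd

i = pd.date_range("2000-01-11", "2022-02-10", freq="D")
rng = np.random.default_rng(0)  # fixed seed, for reproducibility only
t = pd.Series(index=i, data=rng.integers(0, 100, len(i)))

# Anniversary edges: the first datapoint plus 1, 2, ... whole years.
start = t.index[0]
n_years = t.index[-1].year - start.year
edges = pd.DatetimeIndex(
    [start] + [start + pd.DateOffset(years=k) for k in range(1, n_years + 1)]
)

# For every timestamp, find the last edge that is <= the timestamp.
# That edge is the start of the interval the timestamp belongs to.
codes = edges.searchsorted(t.index, side="right") - 1
labels = edges[codes]

# Group by the interval start, so the result is labeled 2000-01-11, 2001-01-11, ...
out = t.groupby(labels).mean()
print(out)
```

With this approach the final, partial interval starting at 2022-01-11 is kept as its own group. If the series started on February 29, `pd.DateOffset(years=k)` would move the edge to February 28 in non-leap years, which may or may not be the behavior you want.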