Define a non-standard year period for pandas cumsum


I have a long time series with a datetime index. I want to compute a cumulative sum per year, but with each year defined as 1 Oct to 30 Sep of the following year.

Example: cumulative sum over 1 Oct 2018 to 30 Sep 2019.

Thanks for your help!


Solution

  • One way is to manually assign October, November, and December to the following year:

    import numpy as np
    import pandas as pd
    
    # toy data
    s = pd.DatetimeIndex(['2017-09-01', '2017-10-01', '2017-11-01'])
    df = pd.DataFrame([0,1,2], index=s)
    
    # label Oct, Nov, Dec with the following year
    groups = np.where(df.index.month > 9, df.index.year + 1, df.index.year)
    # groups: 2017, 2018, 2018
    
    # cumulative sum that restarts at each new label
    df.groupby(groups).cumsum()
    

    A second option is to convert the index to periods whose fiscal year ends in September and take the fiscal year:

    groups = df.index.to_period('Q-SEP').qyear
    # Int64Index([2017, 2018, 2018], dtype='int64')
    
    df.groupby(groups).cumsum()
    

    Output (both options give the same result):

                0
    2017-09-01  0
    2017-10-01  1
    2017-11-01  3
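
To see the reset at the year boundary, the toy data can be stretched over more than one Oct-Sep year. The following is a minimal sketch, not part of the original answer: the helper name `fiscal_year_labels`, its `start_month` parameter, and the column names `value` and `cumsum` are made up for illustration. It builds the labels with `np.where` as in the first option and cross-checks them against the `Q-SEP` labels from the second option.

```python
import numpy as np
import pandas as pd


def fiscal_year_labels(index, start_month=10):
    """Label each timestamp with the calendar year in which its custom year ends.

    Hypothetical helper: `start_month` is the first month of the custom year
    (10 means the year runs from 1 Oct to 30 Sep).
    """
    if start_month == 1:
        # A January start is just the ordinary calendar year.
        return np.asarray(index.year)
    return np.where(index.month >= start_month, index.year + 1, index.year)


# Toy data: one value of 1 per month, from Aug 2018 to Nov 2019
idx = pd.date_range("2018-08-01", "2019-11-01", freq="MS")
df = pd.DataFrame({"value": np.ones(len(idx), dtype=int)}, index=idx)

# Cumulative sum that restarts every 1 Oct
labels = fiscal_year_labels(df.index)
df["cumsum"] = df.groupby(labels)["value"].cumsum()

# Cross-check against the period-based labels from the second option
qyear = df.index.to_period("Q-SEP").qyear
assert (labels == qyear).all()

print(df)
```

With one unit per month, the `cumsum` column should climb to 12 on 2019-09-01 and drop back to 1 on 2019-10-01, which makes the custom year boundary easy to spot.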