Search code examples
pythonpandastime-series

Monthly and Yearly growth rate in python pandas


I have the following dataframe:

import numpy as np
import pandas as pd
from scipy import stats

df =  pd.DataFrame()

df['A'] = np.linspace(1000, 2000, 120)
df['B'] = np.linspace(1000, 3000, 120)
df['C'] = np.linspace(1000, 4000, 120)
df.index = pd.date_range('1/1/2012', periods=120, freq='MS', name='Date')

I need monthly growth rate referring the corresponding value of the same month of the first year(2012) for each columns as will as yearly Maximum growth rate referring the maximum value of the first year(2012) for each column. My try is:

avg_month_yoy = df.groupby(['month','year',])['A'].apply(  lambda v: 
v.sub(v.shift(1)).div(v.shift(1)).multiply(100).mean()
).reset_index()

I got the following output:

     month  year   A
0    April  2012 NaN
1    April  2013 NaN
2    April  2014 NaN
3    April  2015 NaN
4    April  2016 NaN
5    April  2017 NaN
6    April  2018 NaN
7    April  2019 NaN
8    April  2020 NaN
9    April  2021 NaN
10  August  2012 NaN
11  August  2013 NaN
12  August  2014 NaN
13  August  2015 NaN
14  August  2016

My expected output of monthly growth rate dataframe:

date                          A                    B                                      --- year   month        
01-01-2012    (Xa@Jan2012-Xa@Jan2012)/Xa@Jan2012 (Xb@Jan2012-Xb@Jan2012)/Xb@Jan2012   ---  2012    Jan   
01-02-2012    (Xa@Feb2012-Xa@Feb2012)/Xa@Feb2012   (Xb@Feb2012-Xb@Feb2012)/Xb@Feb2012   --- 2012    Feb                         
            ...      ....     ....       ....       ....     ....
01-12-2022    (Xa@Dec2022-Xa@Dec2012)/Xa@Dec2012   (Xb@Dec2022-Xb@Dec2012)/Xb@Dec2012   ---  2012    Dec 
 

                       
                    

My expected output of yearly growth rate dataframe:

year      A                                 B                                              C        
   
2012 ([email protected]@2012)/Max.Xa@2012 ([email protected]@2012)/Max.Xb@2012    ([email protected]@2012)/Max.Xc@2012   
2013 ([email protected]@2012)/Max.Xa@2012 ([email protected]@2012)/Max.Xb@2012    ([email protected]@2012)/Max.Xc@2012
 ...     ...                       ...                     ...
2022 ([email protected]@2012)/Max.Xa@2012 ([email protected]@2012)/Max.Xb@2012    ([email protected]@2012)/Max.Xc@2012

Solution

  • Demo : https://trinket.io/python3/fa778eb76d?showInstructions=true

    import numpy as np
    import pandas as pd
    from scipy import stats
    
    df =  pd.DataFrame()
    
    df['A'] = np.linspace(1000, 2000, 120)
    df['B'] = np.linspace(1000, 3000, 120)
    df.index = pd.date_range('1/1/2012', periods=120, freq='MS', name='Date')
    
    
    # Monthly dataframe:
    # - append to index for Month
    df_m = df.copy(deep=False)
    df_m['Month'] = df.index.month
    df_m = df_m.set_index('Month', append=True)
    
    # First year's worth of monthly data:
    # - first 12 rows, indexed only by Month
    df_f = df_m.droplevel('Date', axis=0).iloc[0:12]
    
    # Divide monthly data by first year's data (implicitly joined on common index: Month)
    print((df_m / df_f - 1) * 100)
    
    
    
    # Yearly dataframe:
    # - max of each column per year
    df_y = df.groupby(df.index.year).max()
    
    # Divide yearly data by first row from yearly data
    print((df_y / df_y.iloc[0] - 1) * 100)