Search code examples
pythonpandasfinancequandl

How to calculate the yearly cumulative percent change


So I have been studying the SP500 yearly returns with information downloaded from my quandl subscription. I have used resample() and pct_change() to study the data but my results are not coming as to what is expected for some reason.

sp500_df = quandl.get("MULTPL/SP500_REAL_PRICE_MONTH", authtoken="YOUR OWN AUTH KEY")
sp500_Y_ret_df = sp500_df['Value'].resample('Y').mean().pct_change().dropna()

The expected value for the SP 500 return for year ending 2008 should be -38.5% but my code is showing -17% for some reason? If for some reason you cannot access the data I can provide a .csv file for the data. Thanks a million for the help.

sp500_Y_ret_df.loc['2008-12-31']

output:

-0.17319465450687388

last 20 years:

sp500_Y_ret_df.tail(20)

output:

2001-12-31   -0.164631
2002-12-31   -0.164795
2003-12-31   -0.032081
2004-12-31    0.173145
2005-12-31    0.067678
2006-12-31    0.085836
2007-12-31    0.126625
2008-12-31   -0.173195
2009-12-31   -0.224552
2010-12-31    0.203406
2011-12-31    0.113738
2012-12-31    0.087221
2013-12-31    0.190603
2014-12-31    0.175436
2015-12-31    0.067610
2016-12-31    0.014868
2017-12-31    0.170363
2018-12-31    0.121093
2019-12-31    0.065247
2020-12-31    0.061747
Freq: A-DEC, Name: Value, dtype: float64

USING Random made data:

aapl_df = pd.DataFrame({ 
'ticker':np.repeat( ['aapl'], 2500 ),
'date':pd.date_range('1/1/2011', periods=2500, freq='D'),
'price':(np.random.randn(2500).cumsum() + 10) }).set_index('date')
aapl_df.head()

date        
2011-01-01  aapl    9.011290
2011-01-02  aapl    9.092603
2011-01-03  aapl    9.139830
2011-01-04  aapl    7.782112
2011-01-05  aapl    8.316270

using 'last' as stated yielded closer results but not sure if that is pure luck

aapl_Y_ret_df = aapl_df['price'].resample('Y').last()
aapl_Y_ret_df.tail()

output

    date
2013-12-31    18.535328
2014-12-31    15.201832
2015-12-31    36.040411
2016-12-31    42.272464
2017-12-31    20.421079
Freq: A-DEC, Name: price, dtype: float64

--

aapl_Y_ret_df = aapl_df['price'].resample('Y').last().pct_change()
aapl_Y_ret_df.tail()
date
2013-12-31    0.569359
2014-12-31   -0.179846
2015-12-31    1.370794
2016-12-31    0.172918
2017-12-31   -0.516918
Freq: A-DEC, Name: price, dtype: float64

Solution

    • This answers the question of how to calculate the yearly cumulative percent change of the 'Adj Close', which is what the OP wants.
    • Calculate the yearly return by finding the daily percent change in the Close or Adj Close, and then sum and multiply by 100.
    • Use groupby and DataFrameGroupBy.pct_change to get the values by year.
    • df['Adj Close'].resample('Y').mean() returns the mean of the 'Adj Close' values for each year, which is not how to determine the yearly return.
      • The percent change of the mean close from 2007 to 2008 is -17.4%. This is not the return.
    • Tested in python 3.11.2, pandas 2.0.0
    import yfinance as yf
    import pandas as pd
    
    # load S&P 500 data
    df = yf.download('^gspc', start='2000-01-01', end='2020-01-01').reset_index()
    
    # display(df)
            Date         High          Low         Open        Close      Volume    Adj Close
    0 2000-01-03  1478.000000  1438.359985  1469.250000  1455.219971   931800000  1455.219971
    1 2000-01-04  1455.219971  1397.430054  1455.219971  1399.420044  1009000000  1399.420044
    2 2000-01-05  1413.270020  1377.680054  1399.420044  1402.109985  1085500000  1402.109985
    3 2000-01-06  1411.900024  1392.099976  1402.109985  1403.449951  1092300000  1403.449951
    4 2000-01-07  1441.469971  1400.729980  1403.449951  1441.469971  1225200000  1441.469971
    
    # groupby year and determine the daily percent change by year, and add it as a column to df
    df['pct_ch'] = df.groupby(df.Date.dt.year)['Adj Close'].pct_change()
    
    # groupby year and aggregate sum of pct_ch to get the yearly return
    yearly_pct_ch = df.groupby(df.Date.dt.year)['pct_ch'].sum().mul(100).reset_index().rename(columns={'pct_ch': 'cum_pct_ch_year'})
    
    # display(yearly_pct_ch)
        Date  cum_pct_ch_year
    0   2000        -7.274088
    1   2001        -8.890805
    2   2002       -23.811947
    3   2003        21.552072
    4   2004         9.535574
    5   2005         4.295586
    6   2006        11.626670
    7   2007         4.860178
    8   2008       -38.702107
    9   2009        21.622674
    10  2010        12.052038
    11  2011         1.575069
    12  2012        11.840560
    13  2013        24.012739
    14  2014        12.320664
    15  2015         0.501799
    16  2016        11.494988
    17  2017        17.127082
    18  2018        -5.822426
    19  2019        26.031938