Search code examples
pythonpandasgroup-byfinance

groupby rolling agg custom function for portfolio beta


Thanks for reading and in advance for any answers.

Beta is a measure of systematic risk of an investment portfolio. It is calculated by taking the covariance of that portfolios returns against the benchmark / market and dividing it by the variance of the market. I'd like to calc this on a rolling basis against many portfolios.

I have a df as follows

PERIOD,PORT1,PORT2,BM
201504,-0.004,-0.001,-0.013
201505,0.017,0.019,0.022
201506,-0.027,-0.037,-0.039
201507,0.026,0.033,0.017
201508,-0.045,-0.054,-0.081
201509,-0.033,-0.026,-0.032
201510,0.053,0.07,0.09
201511,0.03,0.032,0.038
201512,-0.05,-0.034,-0.044
201601,-0.016,-0.043,-0.057
201602,-0.007,-0.007,-0.011
201603,0.014,0.014,0.026
201604,0.003,0.001,0.01
201605,0.046,0.038,0.031

Except with many more columns like port1 and port2.

I would like to create a dataset with a rolling beta vs the BM column.

I created a similar rolling correlation dataset with

df.rolling(3).corr(df['BM'])

...which took every column in my large set and calced a correlation vs my BM column.

I tried to make a custom function for Beta but because it takes two arguments I am struggling. Below is my custom function and how I got it to work by feeding it two columns of returns.

    def beta(arr1,arr2):
    #ddof = 0 gives population covar. the 0 and 1 coordinates take the arr1 vs arr2 covar from the matrix
    return (np.cov(arr1,arr2,ddof=0)[0][1])/np.var(arr2)

    beta_test = beta(df['PORT1'],df['BM'])

So this helps me find the beta between two columns that I feed in... question is how to do this for my data above and data with many columns/portfolios? And then how to do it on a rolling basis? From what I saw above with the correlation, the below should be possible, to run each rolling 3 month data set in each column vs one specified column.

beta_data = df.rolling(3).agg(beta(df['BM']))

Any pointer in the right direction would be appreciated


Solution

  • IIUC, you can set_index the columns PERIOD and BM, filter the column with PORT in it (in case you have other columns you don't want to apply the beta function), then use rolling.apply like:

    print (df.set_index(['PERIOD','BM']).filter(like='PORT')
             .rolling(3).apply(lambda x: beta(x, x.index.get_level_values(1)))
             .reset_index())
        PERIOD     BM     PORT1     PORT2
    0   201504 -0.013       NaN       NaN
    1   201505  0.022       NaN       NaN
    2   201506 -0.039  0.714514  0.898613
    3   201507  0.017  0.814734  1.055798
    4   201508 -0.081  0.736486  0.907336
    5   201509 -0.032  0.724490  0.887755
    6   201510  0.090  0.598332  0.736964
    7   201511  0.038  0.715848  0.789221
    8   201512 -0.044  0.787248  0.778703
    9   201601 -0.057  0.658877  0.794949
    10  201602 -0.011  0.412270  0.789567
    11  201603  0.026  0.354829  0.690573
    12  201604  0.010  0.562924  0.558083
    13  201605  0.031  1.716066  1.530471