python, python-3.x, linear-regression, statsmodels, quantitative-finance

Why don't these betas match?


I also asked this on Quant Finance, but thought someone here might be able to help as well: https://quant.stackexchange.com/questions/49099/why-dont-these-betas-match

I would expect the beta from regressing my portfolio against the market to match the sum of the individual component betas multiplied by their portfolio weights. I have created a simple example below. Any help explaining where I have gone wrong would be much appreciated.

    import pandas as pd
    import numpy as np
    import statsmodels.api as sm
    from statsmodels import regression

    def beta(x, y):
        # Regress y on x with an intercept and return the slope coefficient
        x = sm.add_constant(x)
        model = regression.linear_model.OLS(y, x).fit()
        return model.params[1]


    bond_one = [100, 96, 102, 88, 96, 101, 120, 110, 105, 107, 106]
    bond_two = [98, 102, 88, 95, 105, 100, 101, 99, 104, 108, 112]
    mkt = [1000, 1004, 1000, 1010, 1020, 1000, 990, 995, 1005, 1025, 1035]

    df_mkt = pd.DataFrame(mkt, columns = ['mkt'])
    df_mkt = df_mkt.pct_change().dropna()
    df = pd.DataFrame(bond_one, columns = ['bond_one'])
    df['bond_two'] = bond_two

    df_price = df.copy()
    df = df.pct_change().dropna()

    notionals = {'bond_one': 2500000,
                    'bond_two': 6500000}

    mkt_values = {key: value*(df_price[key].iloc[-1]/100)
                  for (key, value) in notionals.items()}

    #create portfolio market value
    tot_port = sum(list(mkt_values.values()))
    #generate weights
    wts = {key: value/tot_port for (key, value) in mkt_values.items()}

    #create portfolio returns
    df_port = df.mul(list(wts.values()), axis=1)
    df_port['port'] = df_port.sum(axis=1)

    #add port and market into original dataframe
    df['port'] = df_port['port'].copy()
    df['mkt'] = df_mkt['mkt'].copy()

    #run OLS on individuals and portfolio
    b1_beta = beta(x=df['bond_one'].values, y=df['mkt'].values)
    b2_beta = beta(x=df['bond_two'].values, y=df['mkt'].values)
    port_beta = beta(x=df['port'].values, y=df['mkt'].values)

    calc_beta = wts['bond_one']*b1_beta + wts['bond_two']*b2_beta
    ###why don't calc_beta and port_beta match?

Solution

  • The difference comes from which weights enter each calculation. Because the market values of your portfolio constituents change daily, so do their weights. port_beta is the beta of your portfolio's value through time against the market, so it reflects the historical weights, whereas calc_beta is the weighted sum of the constituent betas computed with the current weights only.
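
The weighting argument can be checked numerically. The following is only a minimal sketch under some assumptions, not the asker's method: it uses plain NumPy instead of statsmodels (the slope cov(x, y) / var(x) is the same as the OLS slope with a constant), the helper names `ols_slope` and `pct_change` are made up for illustration, and beta is taken to mean the slope of asset returns regressed on market returns (y = asset, x = market). Note that the calls in the question pass `x=bond, y=mkt`, which estimates the reverse regression; that slope is not linear in the weights, so the direction matters as well.

```python
import numpy as np

def ols_slope(x, y):
    """Slope of y regressed on x with an intercept: cov(x, y) / var(x)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)

def pct_change(prices):
    """Simple period-over-period returns (hypothetical helper)."""
    prices = np.asarray(prices, dtype=float)
    return prices[1:] / prices[:-1] - 1.0

# Price series and notionals taken from the question
p_one = np.array([100, 96, 102, 88, 96, 101, 120, 110, 105, 107, 106], dtype=float)
p_two = np.array([98, 102, 88, 95, 105, 100, 101, 99, 104, 108, 112], dtype=float)
p_mkt = np.array([1000, 1004, 1000, 1010, 1020, 1000, 990, 995, 1005, 1025, 1035], dtype=float)
n_one, n_two = 2_500_000, 6_500_000

r_one, r_two, r_mkt = pct_change(p_one), pct_change(p_two), pct_change(p_mkt)

# Current weights, based on the latest market values
mv_one = n_one * p_one[-1] / 100
mv_two = n_two * p_two[-1] / 100
w_one = mv_one / (mv_one + mv_two)
w_two = mv_two / (mv_one + mv_two)

# Weighted sum of constituent betas, using the current weights
calc_beta = w_one * ols_slope(r_mkt, r_one) + w_two * ols_slope(r_mkt, r_two)

# Case 1: portfolio returns built with the same fixed weights every period
port_fixed = w_one * r_one + w_two * r_two
port_beta = ols_slope(r_mkt, port_fixed)

# Case 2: buy-and-hold portfolio value, so the weights drift with prices
value = n_one * p_one / 100 + n_two * p_two / 100
hold_beta = ols_slope(r_mkt, pct_change(value))

print("fixed weights match:", np.isclose(port_beta, calc_beta))
print("calc_beta:", calc_beta, "hold_beta:", hold_beta)
```

With constant weights and this regression direction, the portfolio beta equals the weighted sum of constituent betas up to floating-point error, because covariance is linear in the portfolio return. Once the weights drift with prices (the buy-and-hold case), or the regression is run with market returns as the dependent variable, the two numbers will generally differ.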