python, pandas, finance

Tail dependence matrix with pandas


Given a time series of returns of 3 assets like this:

             Asset 1     Asset 2          Asset 3
date                                             
2018-08-09  0.004475    0.008847         0.006195
2018-08-10  0.025081    0.006637         0.013696
2018-08-13 -0.003005    0.011540         0.003559
2018-08-14 -0.012873   -0.009074        -0.005245
2018-08-15  0.003383    0.004275         0.000268

...

I need to calculate a matrix that indicates the probability that, if one asset is in the tail of its return distribution, the other is too. I need an output like this:

                   Asset1      Asset2        Asset3
         Asset1  1.000000     0.00000      0.073171
         Asset2  0.000000     1.00000      0.048780
         Asset3  0.046875     0.04878      1.000000

The interpretation is: if Asset 1 is in its tail, Asset 3 has a 4.6% probability of being in its tail too. The code that I wrote is this:

import pandas as pd

def dummy_tail_dependence(r):
    """r is the dataframe of time series returns in the format presented."""
    # Keep only returns below each asset's 2.5% quantile; the rest become NaN.
    x = r[r < r.quantile(q=0.025)]
    df = pd.DataFrame(index=r.columns)
    for asset in r.columns:
        # Among dates where `asset` is in its tail, the share where each asset is too.
        df[asset] = x.dropna(subset=[asset]).count() / x[asset].count()
    return df

I do not want to depend on loops to achieve the desired output. Is there a more numerical way of doing this using pandas or numpy, for example with matrix operations?


Solution

  • Your example is not very representative (there is only one row where any asset is in its tail), but the following solution may still be of help.

    Start by finding all cells where the corresponding asset is in its tail (here df is the DataFrame of returns, called r in the question):

    in_tail = df < df.quantile(q=0.025)
    

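    Now, for each asset, select the rows where that asset is in its tail and take the mean of every column over those rows. The mean of a boolean series is the fraction of True values, so each mean is the empirical probability that the other asset is in its tail too: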

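    # Each column is conditioned on one asset being in its tail.
    probs = pd.concat([in_tail[in_tail[col]].mean()
                       for col in in_tail], axis=1)
    # Label the columns with the conditioning asset's name.
    probs.columns = probs.index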
    #        Asset1  Asset2  Asset3
    #Asset1     1.0     1.0     1.0
    #Asset2     1.0     1.0     1.0
    #Asset3     1.0     1.0     1.0
    

    (Yes, the example is poor.)
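
Since the question asks for matrix operations, here is a minimal sketch of one loop-free way to get the same numbers. It relies on the fact that the product of the transposed 0/1 tail-indicator matrix with itself counts joint tail events. The function name `tail_dependence_matrix` and the synthetic `demo` data are hypothetical, introduced only for illustration; the orientation (each column is conditioned on that column's asset being in its tail) is assumed to match the code above.

```python
import numpy as np
import pandas as pd

def tail_dependence_matrix(returns, q=0.025):
    """Column j holds P(row asset in its tail | asset j in its tail)."""
    # Boolean frame: True where a return falls below that asset's q-quantile.
    in_tail = returns < returns.quantile(q=q)
    m = in_tail.to_numpy().astype(float)
    # joint[i, j] = number of dates on which assets i and j are both in their tails.
    joint = m.T @ m
    # The diagonal holds each asset's own tail count; broadcasting divides
    # column j by joint[j, j], turning joint counts into conditional frequencies.
    probs = joint / np.diag(joint)
    return pd.DataFrame(probs, index=returns.columns, columns=returns.columns)

# Hypothetical synthetic returns, only to exercise the function.
rng = np.random.default_rng(0)
demo = pd.DataFrame(rng.normal(0, 0.01, size=(1000, 3)),
                    columns=["Asset1", "Asset2", "Asset3"])
probs = tail_dependence_matrix(demo)
print(probs)
```

If an asset has no observation strictly below its quantile (for example, a constant column), the corresponding diagonal count is zero and that column comes out as NaN, so such a case would need separate handling.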