pandas, dataframe, comparison, multi-index

Using 'gt' with two MultiIndex DataFrames


I asked a question recently and Timeless replied with a fascinating bit of code that did what it's supposed to do.

Basically, I'm working with a MultiIndex time-series df from Yahoo Finance. The level 0 column labels are 'Adj Close' and 'High', and the level 1 labels are a list of company codes.

I've tried to adapt the original code and I'm getting in a real mess. This is my adaptation:

def indicators_df(df):
    def xs(df, m):
        return df.xs(m, axis=1, drop_level=False)

    def rn(df, d):
        return df.rename(d, axis=1, level=0)

    tmp1 = df.pipe(xs, "Adj Close")
    tmp2 = df.pipe(xs, "High")

    chk = tmp2.gt(tmp1.shift()).pipe(rn, {"High": "Check"})

    return chk

The original code was:

chk = tmp2.eq(tmp2.shift()).pipe(rn, {"High": "Check"})

It returns True if the current row's 'High' is the same as the previous row's 'High'.

My adaptation makes the following changes:

1) I add in:

    tmp1 = df.pipe(xs, "Adj Close")

2) I change chk to:

    chk = tmp2.gt(tmp1.shift()).pipe(rn, {"High": "Check"})

The objective is to get a True if the current row's High is GREATER THAN the previous row's 'Adj Close'.

This is what I get:

           Adj Close                      Check
            ITUB4.SA PETR4.SA VALE3.SA ITUB4.SA PETR4.SA VALE3.SA
Date
11-09-2023     False    False    False    False    False    False

There are 2 problems:

  1. Why am I getting the Adj Close columns as well as the Check column?

  2. All the values are False. In the actual df, many of the 'High' values are greater than the shifted 'Adj Close' values.

I don't understand why df.eq(df.shift()) works fine but df.gt(otherdf.shift()) does not.

Can anyone shed some light on this?

Thanks!


Solution

  • Why am I getting the Adj Close columns as well as the Check column?

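    Because the first level of the columns has a different name in each frame ('High' vs 'Adj Close'), so Pandas can't align their columns. The comparison is made over the union of both sets of columns, and every label that exists in only one frame is compared against NaN, which gives False. That is also why all the values come out False. Renaming the first level so that both frames match fixes it: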

    >>> tmp2.gt(tmp1.pipe(rn, {"Adj Close": "High"})).pipe(rn, {"High": "Check"})
    
               Check            
                  C1    C2    C3
    Date                        
    02-01-2020  True  True  True
    03-01-2020  True  True  True
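
The snippet above compares 'High' with the unshifted 'Adj Close'. For the stated objective (current 'High' greater than the previous row's 'Adj Close'), the same rename can be applied to the shifted frame. The following is a minimal sketch, not the original answer's code: the tickers C1 and C2, the dates, and the prices are made up for illustration, and the variable names (close, high, prev_close, misaligned) are hypothetical.

```python
import pandas as pd

# Hypothetical sample data: two made-up tickers, three made-up dates.
cols = pd.MultiIndex.from_product([["Adj Close", "High"], ["C1", "C2"]])
idx = pd.Index(["02-01-2020", "03-01-2020", "06-01-2020"], name="Date")
df = pd.DataFrame(
    [
        [10.0, 20.0, 11.0, 21.0],
        [12.0, 19.0, 12.5, 19.5],
        [11.0, 22.0, 11.5, 23.0],
    ],
    index=idx,
    columns=cols,
)

# Keep the first column level so each slice is still a two-level frame.
close = df.xs("Adj Close", axis=1, drop_level=False)
high = df.xs("High", axis=1, drop_level=False)

# Reproduces the problem: the first-level labels differ, so the result
# carries the union of both column sets and every value is False.
misaligned = high.gt(close.shift())

# Relabel the shifted 'Adj Close' frame as 'High' so the columns align,
# compare, then rename the result to 'Check'.
prev_close = close.shift().rename({"Adj Close": "High"}, axis=1, level=0)
chk = high.gt(prev_close).rename({"High": "Check"}, axis=1, level=0)

print(chk)
```

The first row of chk is always False here, because shift() leaves NaN in the first row and a comparison against NaN is False.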