Search code examples
pythonpandasindexingmulti-index

Python Pandas: How to add another name of multiindex?


I was playing around crypto data in pandas. After merging several dataframes, I got this

                                            
       timestamp     open      high       low     close   volume    open      high       low      close    volume  
0  1620202740000  54945.31  54987.01  54945.30  54978.49  118.239  54945.31  54987.01  54945.30  54978.49  4345
1  1620202800000  54978.49  55054.00  54972.04  55027.12  337.619  54945.31  54987.01  54945.30  54978.49  134.239   
2  1620202860000  55027.12  55041.05  54950.05  54951.96  131.414  54945.31  54987.01  54945.30  54978.49  14358.239
3  1620202920000  54951.96  55067.36  54951.95  55063.78  176.529  54945.31  54987.01  54945.30  54978.49  1148.239
4  1620202980000  55063.79  55064.00  55000.00  55014.39  107.082  54945.31  54987.01  54945.30  54978.49  18.239

I want to add another level of index on top, so it would be like

                      btc       btc        btc     btc     btc      eth        eth       eth       eth     eth                      
       timestamp     open      high       low     close   volume    open      high       low      close    volume  
0  1620202740000  54945.31  54987.01  54945.30  54978.49  118.239  54945.31  54987.01  54945.30  54978.49  4345
1  1620202800000  54978.49  55054.00  54972.04  55027.12  337.619  54945.31  54987.01  54945.30  54978.49  134.239   
2  1620202860000  55027.12  55041.05  54950.05  54951.96  131.414  54945.31  54987.01  54945.30  54978.49  14358.239
3  1620202920000  54951.96  55067.36  54951.95  55063.78  176.529  54945.31  54987.01  54945.30  54978.49  1148.239
4  1620202980000  55063.79  55064.00  55000.00  55014.39  107.082  54945.31  54987.01  54945.30  54978.49  18.239

So it will be easy to me to add more columns like this:

for x in ['btc', 'eth']:
    df.loc[:, (x, 'fast_ema_1min')] = df[x]['close'].rolling(window=1).mean()
    df.loc[:, (x, 'slow_ema_20min')] = df[x]['close'].rolling(window=20).mean()

Can someone advise? Thanks.


Solution

  • You can create a MultiIndex like this in a couple of ways:

    new_columns = pd.MultiIndex.from_arrays([
        (["btc"] * 5) + (["eth"] * 5), 
        df.columns[1:] # exclude "timestamp" from our new columns
    ])
    
    new_df = df.set_index("timestamp").set_axis(new_columns, axis=1)
    
    print(new_df)
                        btc                                              eth
                       open      high       low     close   volume      open      high      low     close     volume
    timestamp
    1620202740000  54945.31  54987.01  54945.30  54978.49  118.239  54945.31  54987.01  54945.3  54978.49   4345.000
    1620202800000  54978.49  55054.00  54972.04  55027.12  337.619  54945.31  54987.01  54945.3  54978.49    134.239
    1620202860000  55027.12  55041.05  54950.05  54951.96  131.414  54945.31  54987.01  54945.3  54978.49  14358.239
    1620202920000  54951.96  55067.36  54951.95  55063.78  176.529  54945.31  54987.01  54945.3  54978.49   1148.239
    1620202980000  55063.79  55064.00  55000.00  55014.39  107.082  54945.31  54987.01  54945.3  54978.49     18.239
    

    Alternatively, you can use MultiIndex.from_product like so:

    new_columns = pd.MultiIndex.from_product([
        ["btc", "eth"], 
        ["open", "high", "low", "close", "volume"]
    ])
    
    # same as above
    new_df = df.set_index("timestamp").set_axis(new_columns, axis=1)