Search code examples
pythonpandasdataframemulti-index

How to skip a column level in a multi index dataframe?


I have a nested multi-index dataframe with 3 levels and I want to select the 1st and 3rd level without selecting the 2nd level. Is it possible to do that?

My current code looks like this (updated). now I am getting TypeError: unhashable type: 'slice' for panda which I am not sure why. 'D' and 'E' are the columns that I want to select from the 3rd level.

   df1_1 = data['A'].loc['B',:,'D','E']

Solution

  • Using a sample dataframe from pandas docs You can use loc with axis parameter or slicers or IndexSlice:

    Setup:

    def mklbl(prefix,n):
        return ["%s%s" % (prefix,i)  for i in range(n)]
    
    miindex = pd.MultiIndex.from_product([mklbl('A',4),
                                            mklbl('B',2),
                                          mklbl('C',4),
                                           mklbl('D',2)])
    
    micolumns = pd.MultiIndex.from_tuples([('a','foo'),('a','bar'),
                                            ('b','foo'),('b','bah')],
                                           names=['lvl0', 'lvl1'])
    
    
    dfmi = pd.DataFrame(np.arange(len(miindex)*len(micolumns)).reshape((len(miindex),len(micolumns))),
                        index=miindex,
                         columns=micolumns).sort_index().sort_index(axis=1)
    
    
    dfmiT = dfmi.T.copy()
    
    dfmiT
    

    Input dataframe:

              A0                                    ...    A3                      \
              B0                             B1     ...    B0        B1             
              C0     C1      C2      C3      C0     ...    C3        C0        C1   
              D0 D1  D0  D1  D0  D1  D0  D1  D0  D1 ...    D0   D1   D0   D1   D0   
    lvl0 lvl1                                       ...                             
    a    bar   1  5   9  13  17  21  25  29  33  37 ...   217  221  225  229  233   
         foo   0  4   8  12  16  20  24  28  32  36 ...   216  220  224  228  232   
    b    bah   3  7  11  15  19  23  27  31  35  39 ...   219  223  227  231  235   
         foo   2  6  10  14  18  22  26  30  34  38 ...   218  222  226  230  234   
    

    Let's use .loc with axis parameter:

    dfmiT.loc(axis=1)['A3',:,'C2']
    

    Output:

                A3               
                B0        B1     
                C2        C2     
                D0   D1   D0   D1
    lvl0 lvl1                    
    a    bar   209  213  241  245
         foo   208  212  240  244
    b    bah   211  215  243  247
         foo   210  214  242  246