Search code examples
pythonpandasmulti-index

Subsetting multiindex columns


I have a dataframe with multiindex hierarchical colmn names, including empty strings as column index names. How to subset second and third columns?

import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(15).reshape(5,3),
                         index=[1,2,3,4,5],
                         columns=[['A', 'A', 'B'],
                                  ['a', 'b', ''],
                                  ['',  'x', '']]
)
df.columns.names = ["c_ix0", "c_ix1", "c_ix2"]
print(df)

c_ix0   A       B
c_ix1   a   b    
c_ix2       x    
1       0   1   2
2       3   4   5
3       6   7   8
4       9  10  11
5      12  13  14

expected output:

c_ix0      A   B
c_ix1      b    
c_ix2      x    
1          1   2
2          4   5
3          7   8
4         10  11
5         13  14

Solution

  • I believe you need xs:

    a = df.xs('b', axis=1, level=1)
    print (a)
    c_ix0   A
    c_ix2   x
    1       1
    2       4
    3       7
    4      10
    5      13
    
    b = df.xs('B', axis=1, level=0)
    print (b)
    c_ix1    
    c_ix2    
    1       2
    2       5
    3       8
    4      11
    5      14
    

    If want select by positions use iloc:

    c = df.iloc[:, 1]
    print (c)
    1     1
    2     4
    3     7
    4    10
    5    13
    Name: (A, b, x), dtype: int32
    

    EDIT:

    d = df.iloc[:, [1, 2]]
    print (d)
    c_ix0   A   B
    c_ix1   b    
    c_ix2   x    
    1       1   2
    2       4   5
    3       7   8
    4      10  11
    5      13  14