Search code examples
pythonpandasdataframemulti-index

Pandas selection by MultiIndex


Question about selecting rows by MultiIndex.

df:

              colA   ColB

   A1   B1   1  ca1  cb1
             2  ca2  cb2
        B2   1  ca3  cb3
             2  ca4  cb4 
   A2   B1   1  ca5  cb5
             2  ca6  cb6
        B2   1  ca7  cb7
             2  ca8  cb8

I want to select colA from A2/B1 where MultiIndex levels(2) > 1. I'm doing it in multiple steps:

df1=df.loc[A2,[B7],:,]  

for some reason I have to present level(1) in brackets

df1.index = x1.index.droplevel(0)

df1.index = x1.index.droplevel(0)

df1.loc[1:,'colA']

I know it is way too much but I cant figure how to do it simpler. Ways I can think of don't want to work.


Solution

  • You can also do it in two operations. First use loc to select parts in A2 and B1, and then do another query at the remaining index, by setting query('index > 1').

    df = pd.DataFrame({'ColB': {('A1', 'B1', 1): 'cb1',
      ('A1', 'B1', 2): 'cb2',
      ('A1', 'B2', 1): 'cb3',
      ('A1', 'B2', 2): 'cb4',
      ('A2', 'B1', 1): 'cb5',
      ('A2', 'B1', 2): 'cb6',
      ('A2', 'B2', 1): 'cb7',
      ('A2', 'B2', 2): 'cb8'},
     'colA': {('A1', 'B1', 1): 'ca1',
      ('A1', 'B1', 2): 'ca2',
      ('A1', 'B2', 1): 'ca3',
      ('A1', 'B2', 2): 'ca4',
      ('A2', 'B1', 1): 'ca5',
      ('A2', 'B1', 2): 'ca6',
      ('A2', 'B2', 1): 'ca7',
      ('A2', 'B2', 2): 'ca8'}})
    
    # Answer here
    df.loc["A2", "B1"].query("index > 1")
    Output:
        ColB    colA
     2  cb6     ca6