Search code examples
pythonpandasindexingconditional-statementsmulti-index

pandas multi-index select data by logical arrays


here i has a dataframe

import pandas as pd
import numpy as np
data = pd.DataFrame(np.arange(100).reshape(20,5), columns=list('abcde'))
data = data.set_index(['a', 'b'])
print(data.head())
        c   d   e
a  b
0  1    2   3   4
5  6    7   8   9
10 11  12  13  14
15 16  17  18  19
20 21  22  23  24

How can i select data use loc functions by the logical judge? Like this: a<20 and b > 1


Solution

  • You can use Index.get_level_values with boolean indexing:

    mask = (data.index.get_level_values('a') < 20) & (data.index.get_level_values('b') > 1)
    print (mask)
    [False  True  True  True False False False False False False False False
     False False False False False False False False]
    
    print (data[mask])
            c   d   e
    a  b             
    5  6    7   8   9
    10 11  12  13  14
    15 16  17  18  19
    

    Solution with DataFrame.query is similar as another answer, only changed & to and:

    print (data.query("a < 20 and b > 1"))
            c   d   e
    a  b             
    5  6    7   8   9
    10 11  12  13  14
    15 16  17  18  19
    

    Query python versus pandas syntax comparison.