Tags: python, pandas, indexing, boolean, conditional-statements

How to do boolean slicing in Pandas


I have a dataframe that I would like to slice based on a comparison between two of its columns. For instance, in the following example I would like to extract the rows where column x is larger than column y:

import pandas as pd
d = pd.DataFrame({'x':[1, 2, 3, 4, 5], 'y':[4, 5, 6, 7, 8]})

d[d[:"x"]>d[:"y"]]

By doing so I get an error:

"unhashable type: 'slice'"


Solution

  • You need to omit the colon (:) and use boolean indexing:

    d[d["x"]>d["y"]]
    

    Sample (the last value of y is changed so that one row matches):

    d = pd.DataFrame({'x':[1, 2, 3, 4, 5], 'y':[4, 5, 6, 7, 3]})
    print(d)
       x  y
    0  1  4
    1  2  5
    2  3  6
    3  4  7
    4  5  3
    
    print(d["x"]>d["y"])
    0    False
    1    False
    2    False
    3    False
    4     True
    dtype: bool
    
    print(d[d["x"]>d["y"]])
       x  y
    4  5  3
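
As a minimal sketch building on the sample above, the same comparison can also be written in a few equivalent ways. The variable names (mask, by_brackets, by_loc, by_query) are illustrative only and do not come from the original question. The .loc form is included on the assumption that the colon in the question was meant as "all rows"; with .loc that spelling is valid.

```python
import pandas as pd

# Same sample frame as in the answer (last y value changed so one row matches).
d = pd.DataFrame({'x': [1, 2, 3, 4, 5], 'y': [4, 5, 6, 7, 3]})

# Build the boolean mask once; it is a Series of True/False aligned on the index.
mask = d["x"] > d["y"]

# With .loc, "all rows, column x" is a valid way to select a column,
# so this produces the same mask as the line above.
mask_via_loc = d.loc[:, "x"] > d.loc[:, "y"]

by_brackets = d[mask]         # plain boolean indexing, as in the answer
by_loc = d.loc[mask]          # the same rows selected through .loc
by_query = d.query("x > y")   # the same rows selected with a query string

print(by_brackets)
```

All three selections should return the same single row (index 4). Which spelling to use is mostly a matter of taste: the bracket form is the shortest, while query can be easier to read when several column conditions are combined.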