python pandas dataframe predicate

DataFrame change values of multiple columns according to provided predicate


Given:

from pandas import DataFrame
import pandas as pd

d = {'x':[2,3,1,4,5],
     'y':[5,4,3,2,1],
     'letter':['a','a','b','b','c']}

df = DataFrame(d)

and some function f that takes two arguments and returns a boolean value,
I would like to write something like:

result = df[['x', 'y']].apply(f)

and get a boolean vector according to the predicate function f.
For example, if f = lambda x, y: x > 3 and y < 3, result should be equal to [False, False, False, True, True].
Is there a nice and simple way to do that? I have not found a solution yet.


Solution

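  • You don't need apply here. Python's and raises a ValueError on a Series because its truth value is ambiguous, so combine the element-wise comparisons with the bitwise & operator and parenthesize each one: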

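    import pandas
    
    d = {'x':[2,3,1,4,5],
         'y':[5,4,3,2,1],
         'letter':['a','a','b','b','c']}
    
    # assign() adds a new column; the lambda receives the DataFrame being built
    df = (
        pandas.DataFrame(d)
            .assign(
                # & combines the two boolean Series element by element
                condition=lambda df: (df['x'] > 3) & (df['y'] < 3)
            )
    )
    print(df)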

    And that gives me:

       x  y letter  condition
    0  2  5      a      False
    1  3  4      a      False
    2  1  3      b      False
    3  4  2      b       True
    4  5  1      c       True
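
If the predicate has to stay a separate two-argument function f, as the question asks, the same idea can be sketched as below. The names f_vectorized and f_scalar are made up for illustration and are not part of the original answer. The first form assumes the predicate can be written with element-wise operators; the second is a slower row-by-row fallback for a predicate that only accepts scalars, such as the lambda in the question.

```python
import pandas as pd

df = pd.DataFrame({'x': [2, 3, 1, 4, 5],
                   'y': [5, 4, 3, 2, 1],
                   'letter': ['a', 'a', 'b', 'b', 'c']})


def f_vectorized(x, y):
    """Hypothetical predicate written for whole columns (uses &, not `and`)."""
    return (x > 3) & (y < 3)


# The scalar predicate exactly as given in the question.
f_scalar = lambda x, y: x > 3 and y < 3

# Option 1: pass the two columns straight to the vectorized predicate.
result_vectorized = f_vectorized(df['x'], df['y'])

# Option 2: call the scalar predicate once per row (axis=1 iterates over rows).
result_rowwise = df[['x', 'y']].apply(
    lambda row: f_scalar(row['x'], row['y']), axis=1
)

print(result_vectorized.tolist())  # [False, False, False, True, True]
print(result_rowwise.tolist())     # [False, False, False, True, True]
```

Both calls return a boolean Series aligned with df's index, so either one can be used directly as a mask, for example df[result_vectorized].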