
Pandas Series Chaining: Filter on boolean value


How can I filter a pandas series based on boolean values?

Currently I have:

s.apply(lambda x: myfunc(x, myparam)).where(lambda x: x).dropna()

What I want is to keep only the entries where myfunc returns True. myfunc is a complex function that uses third-party code and operates only on individual elements.

How can I make this more understandable?


Solution

  • Use boolean indexing:

    mask = s.apply(lambda x: myfunc(x, myparam))
    print(s[mask])
    

    If the index of mask differs from the index of s, filter by a 1-D array instead:

    # pandas 0.24+
    print(s[mask.to_numpy()])
    
    # pandas below 0.24
    print(s[mask.values])
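
To illustrate the index-mismatch case, here is a minimal sketch. The series, labels and mask values are made up for illustration and do not come from the question. When the mask carries labels that s does not have, pandas cannot align the two, so indexing with the Series mask is expected to fail, while a plain NumPy array is applied by position:

```python
import pandas as pd

# Hypothetical data: s and mask have the same length but different index labels
s = pd.Series([10, 20, 30], index=["a", "b", "c"])
mask = pd.Series([True, False, True], index=["x", "y", "z"])

# Indexing with the Series mask tries to align on labels, which cannot work here
try:
    s[mask]
    aligned_ok = True
except Exception:
    aligned_ok = False
print("label-aligned indexing worked:", aligned_ok)

# Converting the mask to a NumPy array drops its index, so filtering is positional
filtered = s[mask.to_numpy()]
print(filtered)
```

Positional filtering only makes sense when the mask has the same length and order as s, so it is worth checking that before dropping the index.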
    

    EDIT:

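    import pandas as pd
    
    s = pd.Series([1, 2, 3])
    
    def myfunc(x, n):
        # Toy stand-in for the real element-wise function
        return x > n
    
    myparam = 1
    # Build the boolean mask element by element, then index s with it
    a = s[s.apply(lambda x: myfunc(x, myparam))]
    print(a)
    1    2
    2    3
    dtype: int64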
    

    A solution with a callable is also possible, but it is a bit overcomplicated in my opinion:

    # The callable passed to .loc receives the series and must return a boolean mask
    a = s.loc[lambda ser: ser.apply(lambda x: myfunc(x, myparam))]
    print(a)
    1    2
    2    3
    dtype: int64
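
Since the question is about readability in a method chain, one further option is to give the filter a name and call it through Series.pipe. This is only a sketch: keep_where is a hypothetical helper, not a pandas function, and myfunc is the toy stand-in from the example above.

```python
import pandas as pd

def myfunc(x, n):
    # Toy stand-in for the real element-wise function
    return x > n

def keep_where(series, predicate):
    # Hypothetical helper: keep the entries for which predicate(element) is truthy
    mask = series.apply(predicate).astype(bool)
    return series[mask]

s = pd.Series([1, 2, 3])
myparam = 1

# The filter now reads as a single named step in the chain
result = s.pipe(keep_where, lambda x: myfunc(x, myparam))
print(result)
```

The helper still calls the function once per element, so it is no faster than the apply-based mask. The only gain is that the intent is visible at the call site.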