python pandas dataframe comparison

Sum of columns over threshold in pandas


I'm trying to sum a bunch of columns in pandas and check whether that sum is over 100 or not.

I already have the sum part sorted; what I'm trying to find is a way to compare each value of the sum to a scalar.

Here's my first attempt:

df[[col1,col2]].sum(axis=1) > 100.0

This gave back ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all()

I've tried a.any() and a.all(), but each returns only a single True or False, so that doesn't work. I've also tried creating a pandas Series with the value, but that gives an error too.


Solution

  • Doing

    (df[[col1,col2]].sum(axis=1) > 100.0).any()

    returns True if the sum of any row is over 100, and False otherwise.

    Thanks to Dogbert for providing this answer in the comments.
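
For anyone who wants the per-row result rather than a single True/False, here is a minimal sketch. The data and the column names "col1" and "col2" are made up for illustration and are not from the question. As far as I can tell, the comparison on its own already gives an element-wise boolean Series, and the ValueError typically shows up only when that Series is used where a single truth value is expected (for example in an `if`); that is a guess about what happened in the question, not something it states.

```python
import pandas as pd

# Hypothetical data; "col1" and "col2" are placeholder column names.
df = pd.DataFrame({"col1": [10, 60, 90], "col2": [20, 50, 5]})

# Row-wise sums of the selected columns: 30, 110, 95
row_sums = df[["col1", "col2"]].sum(axis=1)

# Element-wise comparison: a boolean Series with one value per row
over = row_sums > 100.0

# Reduce the mask to a single answer
any_over = over.any()   # is at least one row over 100?
all_over = over.all()   # is every row over 100?

# Using the Series itself as a condition raises the ValueError
try:
    if over:
        pass
except ValueError as exc:
    error_message = str(exc)

# The mask can also select the rows whose sum is over the threshold
rows_over = df[over]
```

So `over` answers "which rows are over 100", `.any()` and `.all()` answer "is any / is every row over 100", and `df[over]` keeps only the matching rows.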