Say you have a list of pandas Series objects, each of boolean dtype:
boolean_series_list = [s1, s2, s3, ..., sn]
You have another series s which has the same index as all the boolean series in boolean_series_list, and you want to index it to return only the values for which True appears at the corresponding index of any of the series in boolean_series_list. How do you do that?
I know the | operator can be used to combine such series:
s[s1|s2]
but how do you do this for the entire list of such series without manually writing it out as s[s1|s2|s3|...|sn]? Something like:
cond = boolean_series_list[0]
for series in boolean_series_list[1:]:
    cond = cond | series
s[cond]
works, but it seems relatively clunky considering the typically neat high-level interface that pandas provides for working with boolean series, such as the | operator and others, even though Series objects aren't actual booleans. With actual booleans in Python, you can just use the built-in any() function, but any(boolean_series_list) raises:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
so is there any neat equivalent to any() for pandas objects? The same question applies to the & operator, which for actual booleans is served by the built-in all().
You can first gather all the boolean series into one DataFrame, with one column per series, then use any along the columns (axis=1) to aggregate the conditions row by row into a single series:
cond = pd.concat(boolean_series_list, axis=1).any(axis=1)
s[cond]
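As a minimal worked sketch of the above, here is the same idea on a small made-up data set. The values of s, s1, s2 and s3 are purely illustrative, and the example assumes, as in the question, that every boolean series shares the index of s:

```python
import pandas as pd

# Illustrative data only; all series share the same index, as in the question.
s = pd.Series([10, 20, 30, 40], index=list("abcd"))
s1 = pd.Series([True, False, False, False], index=s.index)
s2 = pd.Series([False, False, True, False], index=s.index)
s3 = pd.Series([False, False, True, False], index=s.index)
boolean_series_list = [s1, s2, s3]

# Each mask becomes one column; any(axis=1) is True for a row
# when at least one mask is True at that index label.
cond = pd.concat(boolean_series_list, axis=1).any(axis=1)
result = s[cond]
print(result)
```

Here the rows labelled a and c are kept, which matches what s[s1|s2|s3] selects.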
To get the equivalent of chained & operators, you can similarly use all instead of any.
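If you would rather stay close to the loop in the question, one alternative (not part of the approach above, just a sketch) is to fold the list with functools.reduce and the operator module, which applies | or & pairwise across the list. The sample data below is made up for illustration, and the series are again assumed to share one index:

```python
from functools import reduce
import operator

import pandas as pd

# Illustrative data only; all series share the same index.
s = pd.Series([10, 20, 30, 40], index=list("abcd"))
s1 = pd.Series([True, True, False, False], index=s.index)
s2 = pd.Series([True, False, True, False], index=s.index)
s3 = pd.Series([True, False, False, False], index=s.index)
boolean_series_list = [s1, s2, s3]

# Equivalent to s1 | s2 | s3 and s1 & s2 & s3 respectively.
any_cond = reduce(operator.or_, boolean_series_list)
all_cond = reduce(operator.and_, boolean_series_list)

# The concat-based form of the "all" case, for comparison.
all_cond_concat = pd.concat(boolean_series_list, axis=1).all(axis=1)

print(s[any_cond])
print(s[all_cond])
```

Both forms give the same masks here; the concat version does the work in one vectorized call, while reduce avoids building the intermediate DataFrame.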