Impose OR on a list of conditions (masks) python pandas


I have a dataframe of the following type:

                  dummy1  dummy2  dummy3  ...  dummy8  dummy9  dummy10
Date       ID                             ...                        
1998-01-01 X       1        NaN      NaN  ...     NaN     NaN     NaN
           Y       1        NaN      NaN  ...     NaN     NaN     NaN
1998-01-02 X       NaN      NaN      NaN  ...     NaN     NaN     NaN
           Y       NaN      NaN      NaN  ...     NaN     NaN     NaN
1998-01-05 X       NaN      NaN      NaN  ...     NaN     NaN     NaN
                   ...      ...      ...  ...     ...     ...     ...
2016-12-27 Y       NaN        1      NaN  ...     NaN     NaN     NaN
2016-12-28 X       NaN        1      NaN  ...     NaN     NaN     NaN
           Y       NaN      NaN      NaN  ...     NaN     NaN     NaN
2016-12-29 X       NaN      NaN      NaN  ...     NaN     1       NaN
           Y       NaN      NaN      NaN  ...     NaN     1       NaN

Now, I have a list of Boolean Series, which I call masks, that serve as specific masks for the dataframe above. The list contains one element for every dummy column of the dataframe. For example, masks[0] is:

Date        Index
 1998-01-01  X      True
             Y      True
 1998-01-02  X     False
             Y     False
 1998-01-05  X     False
                      ...  
 2016-12-27  Y     False
 2016-12-28  X     False
             Y     False
 2016-12-29  X     False
             Y     False

Now, regardless of how I built the mask series (they are True if the corresponding dummy is 1 in the dataframe), I would like to apply the following command:

df_new=df[masks[0] | masks[1] | masks[2] | masks[3] | masks[4] | masks[5] | masks[6] | masks[7] | masks[8] | masks[9]]

which means I want to apply all the masks contained in the list masks with the OR operator, all at the same time, on my dataframe df. How do I do it in an 'automatic' way, without manually specifying all the elements of masks? It is very important for me to be able to automate this step, because it is part of a function that creates a varying number of masks based on the number of dummy columns in the dataframe.


Solution

  • How about creating a function that iterates through a list of masks?

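    def filter_many_or(list_of_masks):
        # Start from the first mask (the list must not be empty).
        aggregate_mask = list_of_masks[0]

        # OR every remaining mask into the running result, element-wise.
        for mask in list_of_masks[1:]:
            aggregate_mask = aggregate_mask | mask

        return aggregate_mask

    # Usage: df_new = df[filter_many_or(masks)]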
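
The same fold can also be written without an explicit loop. The sketch below is only an illustration: the small frame and the way masks is built (True where a dummy equals 1, as described in the question) are made-up stand-ins for the real data. It shows two options that should give the same result as the loop: functools.reduce with operator.or_, and pd.concat(...).any(axis=1).

```python
from functools import reduce
import operator

import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the dummy-column frame in the question.
index = pd.MultiIndex.from_product(
    [pd.to_datetime(["1998-01-01", "1998-01-02"]), ["X", "Y"]],
    names=["Date", "ID"],
)
df = pd.DataFrame(
    {
        "dummy1": [1, 1, np.nan, np.nan],
        "dummy2": [np.nan, np.nan, np.nan, 1],
        "dummy3": [np.nan, np.nan, np.nan, np.nan],
    },
    index=index,
)

# Assumed construction: one Boolean mask per dummy column, True where it equals 1.
masks = [df[col] == 1 for col in df.columns]

# Option 1: fold the list with the element-wise OR operator.
combined = reduce(operator.or_, masks)
df_new = df[combined]

# Option 2: line the masks up as columns and test whether any is True per row.
combined_alt = pd.concat(masks, axis=1).any(axis=1)
```

Both options work for any number of masks, as long as all masks share the dataframe's index. Note that reduce raises a TypeError on an empty list unless an initial value is passed, so a function that may produce zero masks would need to handle that case separately.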