Search code examples
pandasconditional-statementsdrop

Drop a column based on the existence of another column


I'm actually trying to figure out how to drop a column based on the existence of another column. Here is my problem :

I start with this DataFrame. Each "X" column is associated with a "Y" column using a number. (X_1,Y_1 / X_2,Y_2 ...)

  Index X_1 X_2 Y_1  Y_2                
    1    4   0   A   NaN
    2    7   0   A   NaN
    3    6   0   B   NaN
    4    2   0   B   NaN
    5    8   0   A   NaN

I drop NaN values using pd.dropna(). The result I get is this DataFrame :

  Index X_1 X_2 Y_1             
    1    4   0   A   
    2    7   0   A
    3    6   0   B
    4    2   0   B
    5    8   0   A

The problem is that I want to delete the "X" column associated to the "Y" column that just got dropped. I would like to use a condition that basically says :

"If Y_2 is not in the DataFrame, drop the X_2 column"

I used a for loop combined to if, but it doesn't seem to work. Any ideas ?

Thanks and have a good day.


Solution

  • Setup

    >>> df
    
           CHA_COEXPM1_COR  CHA_COEXPM2_COR CHA_COFMAT1_COR  CHA_COFMAT2_COR
    Index                                                                   
    1                    4                0               A              NaN
    2                    7                0               A              NaN
    3                    6                0               B              NaN
    4                    2                0               B              NaN
    5                    8                0               A              NaN
    

    Solution

    • Identify the columns having NaN values in any row
    • Group the identified columns using the numeric identifier and transform using any
    • Filter the columns using the boolean mask created in the previous step
    m = df.isna().any()
    m = m.groupby(m.index.str.extract(r'(\d+)_')[0]).transform('any')
    

    Result

    >>> df.loc[:, ~m]
    
           CHA_COEXPM1_COR CHA_COFMAT1_COR
    Index                                 
    1                    4               A
    2                    7               A
    3                    6               B
    4                    2               B
    5                    8               A