
Make Most Columns None If Condition in Python


I have a pandas dataframe with one boolean column. In every row where that boolean is False, I want to set most of the other columns to None. I don't want to change the object that I am iterating over. What is an elegant way to do this?


Solution

  • You can use the apply function to iterate over the rows:

    import pandas as pd
    import random
    
    df = pd.DataFrame(
       {"a": list(range(10)), 
        "b": list(range(10)), 
        "c": [random.choice((True, False)) for _ in range(10)]
    })
    
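    def mask(row):
        row = row.copy()  # work on a copy so the row passed in is never mutated
        if row["c"]:      # use `not row["c"]` to blank the rows where the flag is False
            row["b"] = None
        return row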
    
    df.apply(mask, axis=1)
    

    Yields a result like the following (the values in c are random, so the masked rows differ between runs):

       a    b      c
    0  0  0.0  False
    1  1  1.0  False
    2  2  NaN   True
    3  3  3.0  False
    4  4  NaN   True
    5  5  NaN   True
    6  6  NaN   True
    7  7  NaN   True
    8  8  NaN   True
    9  9  NaN   True
    

    Meanwhile the original dataframe is unchanged.

    This is certainly not the fastest method, but it's very flexible. If speed is an important factor, you might also consider the vectorized where method, which keeps values where the condition is True and replaces the rest with NaN:

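    cols_to_replace = ["a", "b"]
    new_col_names = ["new_" + col for col in cols_to_replace]
    # Keep values where c is True, NaN elsewhere; the result goes into
    # new columns so that a and b themselves stay intact.
    df[new_col_names] = df[cols_to_replace].where(df["c"])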
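
If the original dataframe should not gain any columns either, one option is to apply where to a copy. The following is a minimal sketch under that assumption; the sample data and the name `masked` are made up for illustration, and "most columns" is taken to mean every column except the boolean one.

```python
import pandas as pd

# Hypothetical sample data with a fixed boolean column "c".
df = pd.DataFrame({
    "a": [0, 1, 2, 3],
    "b": [10, 11, 12, 13],
    "c": [True, False, True, False],
})

# "Most columns" here means every column except the boolean flag.
cols_to_replace = df.columns.difference(["c"])

# Work on a copy so the original dataframe stays as it was.
masked = df.copy()

# where() keeps a value when the condition is True and writes NaN otherwise,
# so rows with c == False are blanked in the selected columns.
masked[cols_to_replace] = df[cols_to_replace].where(df["c"])

print(masked)
```

Note that pandas represents the missing entries as NaN rather than None, so the integer columns a and b become float columns in `masked`, while `df` keeps its original values and dtypes.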