Tags: pandas, lambda, replace, apply, startswith

Replace values in DataFrame column when they start with string using lambda


I have a DataFrame:

import pandas as pd
import numpy as np
x = {'Value': ['Test', 'XXX123', 'XXX456', 'Test']}
df = pd.DataFrame(x)

I want to replace the values starting with XXX with np.nan using a lambda.

I have tried many things with replace, apply and map, and the best I have managed is a boolean result: False, True, True, False.

The line below works, but I would like to know whether there is a better way to do it; I suspect that apply or replace combined with a lambda would be cleaner.

df.Value.loc[df.Value.str.startswith('XXX', na=False)] = np.nan
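For reference, the same boolean-mask assignment can also be written as a single .loc call on the DataFrame itself, which avoids the chained indexing in df.Value.loc[...]. This is only a sketch on the sample data from the question; the variable name mask is introduced here for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Value': ['Test', 'XXX123', 'XXX456', 'Test']})

# True where the string starts with 'XXX'; na=False treats missing values as non-matches.
mask = df['Value'].str.startswith('XXX', na=False)

# Select rows and column in one .loc call so the assignment targets df directly.
df.loc[mask, 'Value'] = np.nan
```

Selecting the rows and the column in one indexing step means pandas does not have to guess whether the assignment should reach the original frame.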

Solution

  • Use the apply method with a lambda:

    In [80]: x = {'Value': ['Test', 'XXX123', 'XXX456', 'Test']}
    In [81]: df = pd.DataFrame(x)
    In [82]: df.Value.apply(lambda x: np.nan if x.startswith('XXX') else x)
    Out[82]:
    0    Test
    1     NaN
    2     NaN
    3    Test
    Name: Value, dtype: object
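One caveat with the lambda above: x.startswith raises AttributeError if the column contains a non-string entry such as None or NaN. A possible guard, sketched here on sample data extended with a None (an assumption added purely for illustration, not part of the original question):

```python
import numpy as np
import pandas as pd

# Sample data with one missing entry added (hypothetical) to exercise the guard.
df = pd.DataFrame({'Value': ['Test', 'XXX123', None, 'XXX456']})

# The isinstance check leaves non-string entries untouched instead of raising AttributeError.
result = df['Value'].apply(
    lambda v: np.nan if isinstance(v, str) and v.startswith('XXX') else v
)
```

The original question's .loc version gets the same protection from na=False in str.startswith.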
    

    Performance comparison of apply, where, loc (image not available)
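The where variant named in the comparison is not shown in the answer itself. A minimal sketch of what it might look like, together with its counterpart Series.mask, on the same sample data (the variable names are illustrative, not from the original answer):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Value': ['Test', 'XXX123', 'XXX456', 'Test']})

starts_with_xxx = df['Value'].str.startswith('XXX', na=False)

# Series.mask replaces entries where the condition is True (with NaN by default).
masked = df['Value'].mask(starts_with_xxx)

# Series.where keeps entries where the condition is True, so the condition is inverted.
kept = df['Value'].where(~starts_with_xxx)
```

Both calls return a new Series and leave df unchanged; assign the result back to df['Value'] to store it.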