Replace value in pandas dataframe based on where condition


I have created a dataframe called df with this code:

import numpy as np
import pandas as pd

# Build the data as a dict of lists.
data = {'Feature1': [1, 2, -9999999, 4, 5],
        'Age': [20, 21, 19, 18, 34]}
 
# Create DataFrame
df = pd.DataFrame(data)
print(df)

The dataframe looks like this:

   Feature1  Age
0         1   20
1         2   21
2  -9999999   19
3         4   18
4         5   34

Every time there is a value of -9999999 in column Feature1, I need to replace it with the corresponding value from column Age. So the output dataframe would look like this:

   Feature1  Age
0         1   20
1         2   21
2        19   19
3         4   18
4         5   34

Bear in mind that the actual dataframe that I am using has 200K records (the one I have shown above is just an example).

How do I do that in pandas?


Solution

  • You can use Series.mask or np.where. Both are vectorized, so they operate on the whole column at once rather than looping over rows:

    df['Feature1'] = df['Feature1'].mask(df['Feature1'].eq(-9999999), df['Age'])
    # or
    df['Feature1'] = np.where(df['Feature1'].eq(-9999999), df['Age'], df['Feature1'])
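
A third option, not part of the answer above, is boolean indexing with .loc. The sketch below rebuilds the example dataframe from the question so that it runs on its own; the names SENTINEL and is_missing are introduced here for illustration only.

```python
import pandas as pd

SENTINEL = -9999999  # placeholder value used in the question (name is illustrative)

df = pd.DataFrame({'Feature1': [1, 2, SENTINEL, 4, 5],
                   'Age': [20, 21, 19, 18, 34]})

# Boolean Series that is True where Feature1 holds the placeholder
is_missing = df['Feature1'].eq(SENTINEL)

# Overwrite Feature1 with Age on those rows only; other rows are untouched
df.loc[is_missing, 'Feature1'] = df.loc[is_missing, 'Age']

print(df)
```

Like mask and np.where, this assigns in one vectorized step, so it should scale to a dataframe of 200K rows without an explicit Python loop.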