python pandas dataframe replace difference

Replace values in dataframe with difference to last row value by condition


I'm trying to replace every value above 1000 in my dataframe by its difference to the previous row value.

This is the way I tried with pandas:

data_df.replace(data_df.where(data_df["value"] >= 1000), data_df["value"].diff(), inplace=True)

This does not result in an error, but nothing in the dataframe changes. What am I missing?


Solution

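  • import numpy as np
    import pandas as pd

    # Sample data
    d = {"value": [1000, 200002, 50004, 600005]}
    data_df = pd.DataFrame(data=d)

    # Helper column: difference to the previous row (NaN for the first row)
    data_df["diff"] = data_df["value"].diff()
    # Take the difference where value is above 1000, otherwise keep the value
    data_df["value"] = np.where(data_df["value"] > 1000, data_df["diff"], data_df["value"])

    # Remove the helper column again
    data_df.drop(columns="diff", inplace=True)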
    

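    I introduce a helper column "diff" that holds the difference to the previous row. np.where lets you implement the if/else logic: where the condition is true it takes the value from "diff", otherwise it keeps the original value.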

    Hope it helps, thanks!
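
As a possible alternative to the helper column, the same conditional replacement can be sketched with Series.mask, which swaps in another value wherever a condition is true. This is only a sketch: it assumes the threshold of 1000 from the question and reuses the sample data from the answer. It may also hint at why the original attempt changed nothing, since replace looks up values to substitute rather than applying a row-wise condition.

```python
import pandas as pd

# Sample data taken from the answer above
data_df = pd.DataFrame({"value": [1000, 200002, 50004, 600005]})

# Difference of each row to the previous row (NaN for the first row)
diff = data_df["value"].diff()

# Where the condition is True, take the entry from `diff`;
# everywhere else the original value is kept.
data_df["value"] = data_df["value"].mask(data_df["value"] > 1000, diff)

print(data_df)
```

With this data the column ends up as 1000, 199002, -149998, 550001 (stored as floats, because diff() produces a float column). Note that a first row above the threshold would become NaN, since it has no previous row to subtract.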