I have a pandas dataframe which has cumulative values like this:
import pandas as pd
import numpy as np
data = {'Date':['2017-01-27','2017-01-28','2017-01-29','2017-01-30','2017-01-31','2017-02-01','2017-02-02','2017-02-03','2017-02-04','2017-02-05'],
'Kentucky':[1,4,5,7,np.NaN,np.NaN,10,np.NaN,12,13],
'New York':[np.NaN,2,np.NaN,7,np.NaN,np.NaN,np.NaN,np.NaN,9,np.NaN],
'California':[np.NaN,np.NaN,np.NaN,np.NaN,np.NaN,np.NaN,1,np.NaN,2,np.NaN],
'Maine':[np.NaN,2,3,7,np.NaN,np.NaN,10,np.NaN,12,13]}
df = pd.DataFrame(data)
How can I convert the cumulative values to incremental values if the dataframe has null values? I have tried incremental values conversion solution but it does not produce the desired result.
First convert to a data frame:
data = pd.DataFrame(data)
Next use fillna's forward fill to account for the NaN issue. Below shows the code for Maine, but could be applied to the other states:
data['Maine_Inc_Values'] = data['Maine'].fillna(method='ffill').diff()
Then use .loc to correct for the first incremental value (which returns a NaN and not 2):
data.loc[data['Maine_Inc_Values'].isnull(), 'Maine_Inc_Values'] = data['Maine']
Results in: