I have a dataframe like as shown below
df1 = pd.DataFrame({'person_id': [11, 21, 31, 41, 51],
'date_1': ['12/30/1961', '05/29/1967', '02/03/1957', '7/27/1959', '01/13/1971'],
'backward': [0,120,1,2,3],
'forward': [0,12,1,0,0]})
df1['date_1'] = pd.to_datetime(df1['date_1'])
What I would like to do is
1) Shift forward date_1
column based on values from the forward
column (add dates).
2) if forward
column has a value of 0
, then pick the value from backward
and shift the date backward
(subtract dates)
3) if both forward
and backward
are 0
, it will not make any difference to dates as we add/subtract 0 days to date
. For coding purposes, if you still wanna know which column to pick then we can consider the backward
column.
This is what I tried
df1['new_date'] = df1['date_1'].sub(pd.to_timedelta(df1['forward'],unit='d'))
df1['bool'] = df1['date_1'].eq(df1['new_date']) # i do this to find out which rows aren't changed. So I can look at backward column to subtract days for these records
df1.loc[df1.bool == True, "new_date"] = df1['date_1'].sub(pd.to_timedelta(df1['backward'],unit='d'))
I expect my output to be like as shown below. As you can see for the last 2 rows and the 1st row, the forward
column had a value of 0, so we pick values from the backward
column and subtract the dates
you can use mask
on forward to replace the 0s by minus backward column and then add to_timedelta
this series to the date
df1['new_date'] = (df1['date_1']
+ pd.to_timedelta(df1['forward'].mask(lambda x: x.eq(0),
-df1['backward']),
unit='d')
)
print (df1)
person_id date_1 backward forward new_date
0 11 1961-12-30 0 0 1961-12-30
1 21 1967-05-29 120 12 1967-06-10
2 31 1957-02-03 1 1 1957-02-04
3 41 1959-07-27 2 0 1959-07-25
4 51 1971-01-13 3 0 1971-01-10