I am trying to get an additional column ['readmit'] that gives the time elapsed in days since the indexed admission discharge date(first occurance) for each new episode of the same keys admit date.
df = pd.DataFrame({'Key': ['10003', '10003', '10003', '10003', '10003','10003','10034', '10034', '10034', '10034'],
'admit': [20120506, 20120508, 20121014,20121016,20121020,20121110,20120506,20120517, 20120601, 20120606],
'discharge': [20120508, 20120510, 20121015,20121018,20121023,20121118,20120520,20120520, 20120605, 20120620]})
df['admit'] = pd.to_datetime(df['admit'], format='%Y%m%d')
df['discharge'] = pd.to_datetime(df['discharge'], format='%Y%m%d')
Key admit discharge
0 10003 2012-05-06 2012-05-08
1 10003 2012-05-08 2012-05-10
2 10003 2012-10-14 2012-10-15
3 10003 2012-10-16 2012-10-18
4 10003 2012-10-20 2012-10-23
5 10003 2012-11-10 2012-11-18
6 10034 2012-05-06 2012-05-20
7 10034 2012-05-21 2012-05-26
8 10034 2012-06-01 2012-06-05
9 10034 2012-06-09 2012-06-20
final df should look like this:
Key admit discharge readmit
0 10003 2012-05-06 2012-05-08 0
1 10003 2012-05-08 2012-05-10 0
2 10003 2012-10-14 2012-10-15 159
3 10003 2012-10-16 2012-10-18 161
4 10003 2012-10-20 2012-10-23 165
5 10003 2012-11-10 2012-11-18 186
6 10034 2012-05-06 2012-05-20 0
7 10034 2012-05-21 2012-05-26 1
8 10034 2012-06-01 2012-06-05 12
9 10034 2012-06-09 2012-06-20 20
This code does not work:
df['readmit']= (df['admit'].groupby(df.Key) - df['discharge'].groupby(df.Key).nth[0]).dt.days
You can subtract column by Series.sub
with first values per grousp by GroupBy.transform
and GroupBy.first
, get days by Series.dt.days
and convert negative values to 0
by Series.clip
:
df['readmit'] = (df['admit'].sub(df.groupby('Key')['discharge'].transform('first'))
.dt.days
.clip(lower=0))
print (df)
Key admit discharge readmit
0 10003 2012-05-06 2012-05-08 0
1 10003 2012-05-08 2012-05-10 0
2 10003 2012-10-14 2012-10-15 159
3 10003 2012-10-16 2012-10-18 161
4 10003 2012-10-20 2012-10-23 165
5 10003 2012-11-10 2012-11-18 186
6 10034 2012-05-06 2012-05-20 0
7 10034 2012-05-17 2012-05-20 0
8 10034 2012-06-01 2012-06-05 12
9 10034 2012-06-06 2012-06-20 17