I have the following two columns in my data frame, Entry_date and Death_date, containing dates in the format YYYY/MM/DD. I want to subtract them (Death_date - Entry_date = survival_days) and get the result in days. My data looks like the following.
Sample_ID<-c("a1","a2","a3","a4","a5","a6")
Entry_date<-c(2010/04/13, 2008/07/30, 2009/03/06, 2008/08/22, 2009/06/24, 2008/08/26)
Death_date<-c(2007/05/17, 2007/05/16, 2007/05/16, 2007/05/16,2007/05/16, 2010/05/16)
Df<-data.frame(Sample_ID,Entry_date,Death_date)
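I want a column called Df$survival_days as the outcome variable, like the following:
Sample_ID Entry_date Death_date survival_days
a1        2010/04/13 2007/05/17      -1062.00
a2        2008/07/30 2007/05/16       -441.00
a3        2009/03/06 2007/05/16       -660.00
a4        2008/08/22 2007/05/16       -464.00
a5        2009/06/24 2007/05/16       -770.00
a6        2008/08/26 2010/05/16        468.00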
How can I do this in R? I need this variable for my Cox regression survival analysis. My real data frame has around 10,000 observations.
Use difftime with the appropriate units, and provide the dates as strings:
Sample_ID<-c("a1","a2","a3","a4","a5","a6")
Entry_date<-c("2010/04/13", "2008/07/30", "2009/03/06", "2008/08/22", "2009/06/24", "2008/08/26")
Death_date<-c("2007/05/17", "2007/05/16", "2007/05/16", "2007/05/16","2007/05/16", "2010/05/16")
Df<-data.frame(Sample_ID,Entry_date,Death_date)
Df$difference_in_days <- difftime(Df$Death_date, Df$Entry_date, units = "days")
Output
> Df
  Sample_ID Entry_date Death_date difference_in_days
1        a1 2010/04/13 2007/05/17    -1062.0000 days
2        a2 2008/07/30 2007/05/16     -441.0000 days
3        a3 2009/03/06 2007/05/16     -660.0417 days
4        a4 2008/08/22 2007/05/16     -464.0000 days
5        a5 2009/06/24 2007/05/16     -770.0000 days
6        a6 2008/08/26 2010/05/16      628.0000 days
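Row 3 comes out as -660.0417 rather than -660. The most likely cause is that difftime converts the strings to date-times in the local time zone, and a daylight saving change between the two dates adds one hour (1/24 of a day). If whole days and a plain numeric column are wanted, one option is to parse the columns with as.Date first. This is a minimal sketch, assuming every value follows the YYYY/MM/DD pattern; the column name survival_days is taken from the question.

```r
Sample_ID  <- c("a1", "a2", "a3", "a4", "a5", "a6")
Entry_date <- c("2010/04/13", "2008/07/30", "2009/03/06", "2008/08/22", "2009/06/24", "2008/08/26")
Death_date <- c("2007/05/17", "2007/05/16", "2007/05/16", "2007/05/16", "2007/05/16", "2010/05/16")
Df <- data.frame(Sample_ID, Entry_date, Death_date, stringsAsFactors = FALSE)

# Parse as calendar dates: no time of day, so no time zone or DST effects
Df$Entry_date <- as.Date(Df$Entry_date, format = "%Y/%m/%d")
Df$Death_date <- as.Date(Df$Death_date, format = "%Y/%m/%d")

# Subtracting two Date vectors gives a difftime in days;
# as.numeric() drops the "days" label so the column is plain numeric
Df$survival_days <- as.numeric(Df$Death_date - Df$Entry_date)

print(Df$survival_days)
# [1] -1062  -441  -660  -464  -770   628
```

Both steps are vectorised, so this should handle 10,000 rows without trouble. Note that for sample a6 the difference is 628 days, not the 468 shown in the question.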