
Calculating elapsed time for different interview dates in R


My data looks like this:

dat<-data.frame(
subjid=c("a","a","a","b","b","c","c","d","e"),
type=c("baseline","first","second","baseline","first","baseline","first","baseline","baseline"),
date=c("2013-02-07","2013-02-27","2013-04-30","2013-03-03","2013-05-23","2013-01-02","2013-07-23","2013-03-29","2013-06-03"))

That is:

  subjid     type       date
1      a baseline 2013-02-07
2      a    first 2013-02-27
3      a   second 2013-04-30
4      b baseline 2013-03-03
5      b    first 2013-05-23
6      c baseline 2013-01-02
7      c    first 2013-07-23
8      d baseline 2013-03-29
9      e baseline 2013-06-03

I'm trying to create a variable "elapsedtime" that gives the time elapsed from each subject's baseline date to their first- and second-round interview dates (so elapsedtime = 0 for baseline rows). Note that subjects differ in whether they have taken any further interviews.

I tried to reshape the data so that I could subtract the dates from each other, but my brain isn't really functioning today. Is there another way?

Please help, and thank you.


Solution

  • This is screaming out for ave():

    I'll throw an NA value in there just for good measure:

    dat<-data.frame(
    subjid=c("a","a","a","b","b","c","c","d","e"),
    type=c("baseline","first","second","baseline","first","baseline","first","baseline","baseline"),
    date=c("2013-02-07",NA,"2013-04-30","2013-03-03","2013-05-23","2013-01-02","2013-07-23","2013-03-29","2013-06-03"))
    

    You should probably sort the data first, since the calculation below assumes the baseline row comes first within each subject:

    dat$type <- ordered(dat$type,levels=c("baseline","first","second","third") )
    dat <- dat[order(dat$subjid,dat$type),]
    

    Turn your date into a proper Date object:

    dat$date <- as.Date(dat$date)
    

    Then calculate the difference, in days, from the first (baseline) date within each subject:

    dat$elapsed <- ave(as.numeric(dat$date), dat$subjid, FUN = function(x) x - x[1])
    
    #  subjid     type       date  elapsed
    #1      a baseline 2013-02-07        0
    #2      a    first       <NA>       NA
    #3      a   second 2013-04-30       82
    #4      b baseline 2013-03-03        0
    #5      b    first 2013-05-23       81
    #6      c baseline 2013-01-02        0
    #7      c    first 2013-07-23      202
    #8      d baseline 2013-03-29        0
    #9      e baseline 2013-06-03        0
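
The ave() call above takes each subject's first row as the baseline, which is why the sort matters. As a hedged alternative (not part of the original answer), here is a minimal sketch that looks up the baseline date by its type label instead, so row order stops mattering. It assumes every subject has exactly one "baseline" row; the names base and baseline_date are introduced here purely for illustration.

```r
# Same toy data as in the answer, with one missing interview date.
dat <- data.frame(
  subjid = c("a", "a", "a", "b", "b", "c", "c", "d", "e"),
  type   = c("baseline", "first", "second", "baseline", "first",
             "baseline", "first", "baseline", "baseline"),
  date   = as.Date(c("2013-02-07", NA, "2013-04-30", "2013-03-03",
                     "2013-05-23", "2013-01-02", "2013-07-23",
                     "2013-03-29", "2013-06-03")),
  stringsAsFactors = FALSE
)

# One row per subject holding that subject's baseline date
# (assumption: exactly one "baseline" row per subject).
base <- dat[dat$type == "baseline", c("subjid", "date")]

# Look up each row's baseline date by subject id; no sorting required.
baseline_date <- base$date[match(dat$subjid, base$subjid)]

# Date minus Date yields a difftime in days; convert to a plain number.
dat$elapsed <- as.numeric(dat$date - baseline_date)

print(dat)
```

On this data it should give the same elapsed column as the ave() version. A subject without a baseline row would simply get NA for every interview rather than a difference measured from the wrong date.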