r, date, difftime

Difftime between rows inconsistent


I want to calculate the difference in days between consecutive rows (per ID) with difftime. At the beginning I get the right results, but in some rows the values are inconsistent:

PatId Date Tage
3l 2015-02-10 NA
3l 2015-03-30 48
3l 2015-06-03 65

...

5r 2016-02-02 NA
5r 2016-03-01 62
5r 2016-03-29 -469

This is my code:

setDT(AllPat)[, Tage := difftime(AllPat$Date, shift(AllPat$Date), units = "days"), by = PatID]

I already tried it with tz = "GMT", but that doesn't change anything. Maybe someone has an idea?

Does anyone also have an idea how I can change the code so that the difference is always written in the first row of each pair, so that only the last row of each ID is NA?


Solution

  • You cannot refer to AllPat$Date inside the data.table call and expect it to be split by PatID. AllPat$Date always returns the whole column, so every group is computed from the full vector rather than from that patient's rows. Refer to the column by its bare name, Date, so that data.table evaluates it within each PatID group.

    I would convert your data.frame to a data.table first, to keep as much of your code as possible:

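    library(data.table)
    dtAllPat <- as.data.table(AllPat)
    # Use the bare column name so the difference is computed within each PatID group
    dtAllPat[, Tage := difftime(Date, shift(Date), units = "days"), keyby = .(PatID)]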
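
To check the grouped calculation, and to cover the second part of the question (difference written in the first row, NA in the last), here is a small sketch. It rebuilds a toy AllPat from the dates shown in the question, so the data frame itself is an assumption, and the column name TageNext is made up for illustration. It uses by instead of keyby and wraps the result in as.numeric() to get plain numbers; both are choices of this sketch, not part of the answer above.

```r
library(data.table)

# Toy data reusing the dates printed in the question (assumed, not the real AllPat)
AllPat <- data.frame(
  PatID = c("3l", "3l", "3l", "5r", "5r", "5r"),
  Date  = as.Date(c("2015-02-10", "2015-03-30", "2015-06-03",
                    "2016-02-02", "2016-03-01", "2016-03-29"))
)
dtAllPat <- as.data.table(AllPat)

# Days since the previous row of the same patient (NA in each patient's first row)
dtAllPat[, Tage := as.numeric(difftime(Date, shift(Date), units = "days")),
         by = PatID]

# Days until the next row of the same patient (NA in each patient's last row);
# TageNext is a hypothetical column name
dtAllPat[, TageNext := as.numeric(difftime(shift(Date, type = "lead"), Date,
                                           units = "days")),
         by = PatID]

print(dtAllPat)
```

With these dates, Tage comes out as NA, 48, 65 for 3l and NA, 28, 28 for 5r, and TageNext holds the same differences moved up by one row within each patient. If the rows are not already sorted by date within each patient, sort them first (for example with setorder(dtAllPat, PatID, Date)), because shift() only looks at row order.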