Tags: r, date, lubridate, subtraction

Taking the Difference Between Two Dates (R)


I am trying to take the difference between two dates in R. Normally I would just do this in MS Excel, but I want to learn something new by doing it in R. The dates are in the form Year-Month-Day and are stored as factors:

    # what the data looks like
    d <- data.frame(
      "day_a" = c("2010-12-25", "2020-10-31"),
      "day_b" = c("2011-12-24", "2021-01-01")
    )

    d$day_a = as.factor(d$day_a)
    d$day_b = as.factor(d$day_b)

First, I tried a straightforward subtraction:

    # my first attempt to take the difference in days
    d$day_1 = as.Date(d$day_a, "%Y/%m/%d")
    d$day_2 = as.Date(d$day_b, "%Y/%m/%d")

    d$diff = d$day_1 - d$day_2

Then, I tried to use the "lubridate" library:

    library(lubridate)
    d$diff = interval(ymd(d$day_1), ymd(d$day_2))

However, this also did not work.

Could someone please tell me what I am doing wrong?

Thanks


Solution

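  • as.Date() is easiest to reason about with character input, so convert the factors with as.character() first. The conversion failed because your format string was %Y/%m/%d (slashes), while the data in d uses dashes (%Y-%m-%d), so every date was parsed as NA. Since the data is already in the default Y-m-d form, you can drop the format argument and the conversion works fine: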

    d <- data.frame (
      
      "day_a" = c("2010-12-25", "2020-10-31"),
      "day_b" = c("2011-12-24", "2021-01-01")
      
    )
    
    d$day_a = as.factor(d$day_a)
    d$day_b = as.factor(d$day_b)
    
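    # convert factor -> character -> Date (the default format is Y-m-d)
    d$day_1 = as.Date(as.character(d$day_a))
    d$day_2 = as.Date(as.character(d$day_b))
    
    # subtracting Date objects gives a difftime in days
    d$diff = d$day_1 - d$day_2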
    

    Output:

    d
           day_a      day_b      day_1      day_2      diff
    1 2010-12-25 2011-12-24 2010-12-25 2011-12-24 -364 days
    2 2020-10-31 2021-01-01 2020-10-31 2021-01-01  -62 days
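
    To see why the original attempt failed, and to get the difference as a plain number, here is a minimal base R sketch. It rebuilds the same example data; the names bad, diff_days and diff_weeks are only illustrative and do not appear in the question. It also subtracts in the other order (day_b minus day_a), which is an assumption about which sign you want.

```r
# Same example data as in the question, stored as factors.
d <- data.frame(
  day_a = c("2010-12-25", "2020-10-31"),
  day_b = c("2011-12-24", "2021-01-01"),
  stringsAsFactors = TRUE
)

# A format string with slashes does not match dash-separated data,
# so every element comes back as NA instead of raising an error.
bad <- as.Date(as.character(d$day_a), format = "%Y/%m/%d")

# A format string that matches the dashes parses correctly.
day_1 <- as.Date(as.character(d$day_a), format = "%Y-%m-%d")
day_2 <- as.Date(as.character(d$day_b), format = "%Y-%m-%d")

# difftime() makes the unit explicit; as.numeric() drops the unit label.
diff_days  <- as.numeric(difftime(day_2, day_1, units = "days"))   # 364 62
diff_weeks <- as.numeric(difftime(day_2, day_1, units = "weeks"))

print(bad)
print(diff_days)
print(diff_weeks)
```

    Swapping the arguments to difftime() flips the sign, which gives the negative values shown in the output above.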