r, time, date-arithmetic

Difficulty performing arithmetic with dates in R


I am manipulating data containing dates and am having a bit of trouble. Essentially, I wish to calculate a new date based on two existing dates and another variable, for all rows in my data frame. For example, I would like to subtract 10 days from Date1, or calculate the date midway between Date1 and Date2. However, I am having trouble understanding class assignment when adding the newly calculated date to the data frame. Sample data frame:

#  Uncomment to clear your session...
# rm(list = ls(all = TRUE))
tC <- textConnection("StudyID   Date1   Date2
C0031   2-May-09    12-Jan-10
C0032   7-May-09    30-Apr-10")
data <- read.table(header=TRUE, tC)
close.connection(tC)
rm(tC)

# Convert the date columns to class "Date"
# (note: %b matches abbreviated month names in the current locale)
data$Date1 <- as.Date(data$Date1, format = "%d-%b-%y")
data$Date2 <- as.Date(data$Date2, format = "%d-%b-%y")

Now here is where my problem begins:

class(data[1, "Date2"] - 10) # class is "Date". So far so good. 
data[1, "newdate"]  <- (data[1, "Date2"] - 10)
class(data[1, "newdate"]) # class is now "numeric"... 

I also tried:

data[1, "newdate"]  <- as.Date(data[1, "Date2"] - 10)
class(data[1, "newdate"]) # doesn't help. Class still "numeric"... 

I just don't understand why this value becomes numeric when it is assigned to data.


Solution

  • The problem is caused by recycling of your vector, which strips its attributes. As I stated in my comment, use e.g. data$newdate <- data$Date1 - 10 to create the whole column at once, without recycling the vector, so that attributes such as the Date class are retained. Consider the illustrative toy example below:

    # Simple vector with an attribute
    x <- 1:3
    attributes(x) <- list( att = "some attributes" )
    x
    #[1] 1 2 3
    #attr(,"att")
    #[1] "some attributes"
    
    # Simple data.frame with 3 rows
    df <- data.frame( a = 1:3 )
    
    #  New column using first element of vector with attributes
    df$b <- x[1]
    
    #  It is recycled to correct number of rows and attributes are stripped
    str(df$b)
    # int [1:3] 1 1 1
    
    #  Without recycling attributes are retained
    df$c <- x
    str(df$c)
    # atomic [1:3] 1 2 3
    # - attr(*, "att")= chr "some attributes"
    
    #  But they all look the same...
    df
    #  a b c
    #1 1 1 1
    #2 2 1 2
    #3 3 1 3
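
    As a rough sketch of how one might check each step of the toy example, attributes() can be called on the intermediate values. The objects are rebuilt here only so the block runs on its own. Note that x[1] already comes back without the att attribute, since subsetting a plain vector with [ keeps only names.

    ```r
    # Rebuild the toy objects so this block is self-contained
    x <- 1:3
    attributes(x) <- list(att = "some attributes")
    df <- data.frame(a = 1:3)

    # Inspect attributes before assignment
    attributes(x)      # the full vector still carries "att"
    attributes(x[1])   # NULL: the single element no longer carries it

    # Inspect attributes after assignment
    df$b <- x[1]       # one value, recycled to three rows
    df$c <- x          # whole vector, no recycling needed
    attributes(df$b)   # NULL
    attributes(df$c)   # "att" is retained
    ```

    Checking attributes() directly is often clearer than printing the data frame, because the printed columns look identical either way.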
    

    And from your data, the Date class is itself stored as an attribute:

    attributes(data$Date1)
    # $class
    # [1] "Date"
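
    Applied to the data in the question, the whole-column approach might look like the sketch below. The data frame is rebuilt with ISO-formatted dates so the block runs on its own regardless of locale. The column names midpoint and rowwise, and the choice to round the midpoint down to whole days, are illustrative assumptions and not something the question specifies.

    ```r
    # Sample data from the question, rebuilt with unambiguous ISO dates
    data <- data.frame(
      StudyID = c("C0031", "C0032"),
      Date1   = as.Date(c("2009-05-02", "2009-05-07")),
      Date2   = as.Date(c("2010-01-12", "2010-04-30"))
    )

    # Create the whole column in one assignment: the Date class is kept
    data$newdate <- data$Date2 - 10
    class(data$newdate)

    # Midpoint between Date1 and Date2, rounded down to whole days
    # (`midpoint` is a hypothetical column name)
    gap_days <- as.numeric(difftime(data$Date2, data$Date1, units = "days"))
    data$midpoint <- data$Date1 + gap_days %/% 2
    class(data$midpoint)

    # If rows must be filled one at a time, create the column as Date first;
    # assigning into an existing Date column keeps its class
    # (`rowwise` is a hypothetical column name)
    data$rowwise <- as.Date(NA)
    data[1, "rowwise"] <- data[1, "Date2"] - 10
    class(data$rowwise)
    ```

    The vectorised assignments handle every row at once, so the row-by-row variant is only worth using when each row genuinely needs different logic.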