
as.Date in for loop performing unexpectedly


Why does this (admittedly unorthodox) use of as.Date within a for loop produce unexpected results?

I have the following date vector:

df.1 <- c("30-Sep-12", "30-Nov-12", "30-Sep-12", 
  "30-Nov-12", "30-Sep-12", "30-Nov-12", 
  "30-Sep-12")

Now of course to get them in the standard date format, I can use

df.date <- as.Date(df.1, format="%d-%b-%y")

But in the context of my script I wanted to use a for loop:

as.Date(df.1[6], format="%d-%b-%y")  # df.1[6] is "30-Nov-12"
# [1] "2012-11-30"  # as expected

df.for <- df.1
for (i in seq_along(df.1)){
  df.for[i] <- as.Date(df.1[i], format="%d-%b-%y")
}
df.for[6]
# [1] "15674"  # unexpected

Solution

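  • An atomic vector can hold only one type, and df.for is a character vector

    df.for starts as a copy of df.1, so it is a character vector. A Date is stored as a number (days since 1970-01-01) with a class attribute that makes it print like a date. When you use [<- to replace a single element of df.for, R cannot keep the untouched character elements alongside a Date value in the same vector, so it coerces the new value to character. The class attribute is dropped in the process, which leaves the underlying day count as text, e.g. "15674" for 2012-11-30.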

    You could get around this by making df.for a list, so that each element can keep its own Date class, e.g.

    df.for <- as.list(df.1)
    for (i in seq_along(df.1)){
      df.for[[i]] <- as.Date(df.1[i], format="%d-%b-%y")
    }
    

    Or by converting the results back to Date after the loop has finished (via numeric), e.g.

    df.for <- df.1
    for (i in seq_along(df.1)){
      df.for[i] <- as.Date(df.1[i], format="%d-%b-%y")
    }
    
    as.Date(as.numeric(df.for), origin = "1970-01-01")
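
    A third option, which is not part of the answer above and is offered only as a sketch: pre-allocate the result as a Date vector, so the loop assigns Dates into a Date container and nothing needs to be coerced. The example also repeats the character-container loop to make the contrast visible. The names dates_chr, as_chr and as_date are illustrative, and the LC_TIME setting is an assumption so that %b matches English month abbreviations.

    ```r
    # Assumption: force a locale in which %b matches "Sep" and "Nov".
    invisible(Sys.setlocale("LC_TIME", "C"))

    dates_chr <- c("30-Sep-12", "30-Nov-12", "30-Sep-12")

    # Character container: each assigned Date loses its class and is
    # stored as the text of its day count.
    as_chr <- dates_chr
    for (i in seq_along(dates_chr)) {
      as_chr[i] <- as.Date(dates_chr[i], format = "%d-%b-%y")
    }
    class(as_chr)  # "character"
    as_chr[2]      # "15674"

    # Date container: pre-allocate NA Dates, then fill them in the loop.
    as_date <- rep(as.Date(NA), length(dates_chr))
    for (i in seq_along(dates_chr)) {
      as_date[i] <- as.Date(dates_chr[i], format = "%d-%b-%y")
    }
    class(as_date)  # "Date"
    ```

    The type of the vector being assigned into decides the outcome, so the fix is to choose the container before the loop rather than repair the values afterwards.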