
Why is R adding an extra row when converting characters to dates?


I am converting a vector of character data types to date data types in R using strptime.

When I used sapply to check the data types after the conversion, it gave me back an extra row.

Minimal example below:

test_dates = c("2020-10-01","2019-08-09","2018-07-01")
sapply(test_dates,class)
2020-10-01  2019-08-09  2018-07-01 
"character" "character" "character" 

test_dates = strptime(test_dates, "%Y-%m-%d")
sapply(test_dates,class)
     [,1]      [,2]      [,3]     
[1,] "POSIXlt" "POSIXlt" "POSIXlt"
[2,] "POSIXt"  "POSIXt"  "POSIXt" 

The second row at the end is the bit that I am unsure about. I don't know if it is a misunderstanding of sapply, or to do with how R stores times/dates. As below, there is nothing in the second row of the data.

test_dates[1][1]
[1] "2020-10-01 BST"
test_dates[1][2]
[1] NA

Thanks in advance for any help.


Solution

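  • R objects can have more than one class. The second row appears because strptime returns an object with two classes, POSIXlt and POSIXt. sapply calls class on each element, gets a length-2 character vector every time, and simplifies the three results into a matrix with one column per date and one row per class. The second row is therefore the second class name, not an extra row of data.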

    The output of lapply, which does not simplify its result, may be less confusing.

    lapply(test_dates, class)
    
    #[[1]]
    #[1] "POSIXlt" "POSIXt" 
    
    #[[2]]
    #[1] "POSIXlt" "POSIXt" 
    
    #[[3]]
    #[1] "POSIXlt" "POSIXt" 
    

    Also, the class attribute belongs to the vector as a whole, not to its individual elements, so you can check the class of the whole vector instead of each element; every element reports the same value anyway.

    class(test_dates)
    #[1] "POSIXlt" "POSIXt"
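
As a rough sketch of the points above, the following reproduces the matrix and shows where its shape comes from. The variable names (per_element, class_matrix, is_posix) are made up for illustration, and tz = "UTC" is an assumption added only so the result does not depend on the local time zone.

```r
# Same sample data as in the question; tz = "UTC" is an added assumption
test_dates <- strptime(c("2020-10-01", "2019-08-09", "2018-07-01"),
                       "%Y-%m-%d", tz = "UTC")

# simplify = FALSE makes sapply return a plain list, like lapply:
# three elements, each a character vector of two class names
per_element <- sapply(test_dates, class, simplify = FALSE)

# With the default simplify = TRUE, the three equal-length vectors
# are bound together as the columns of a character matrix
class_matrix <- sapply(test_dates, class)
dim(class_matrix)          # rows = class names, columns = dates

# test_dates[1] is a vector of length one, so asking for its second
# element is simply out of range and gives NA
length(test_dates[1])
is.na(test_dates[1][2])

# inherits() checks class membership regardless of how many
# classes the object carries
is_posix <- inherits(test_dates, "POSIXt")
```

If the goal is only to confirm that the conversion worked, a check such as inherits(test_dates, "POSIXt") on the whole vector is usually enough, and it avoids the matrix output altogether.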