
Strange as.Date() behavior


I'm using R 4.2.1 with all packages updated to the latest version. The two lines below differ only in the order of the elements in a concatenated vector, yet the output is completely different.

as.Date(c(Sys.Date(), "2020-09-09"))
as.Date(c("2020-09-09", Sys.Date()))

The output is:

> as.Date(c(Sys.Date(), "2020-09-09"))
[1] "2022-09-16" "2020-09-09"
> as.Date(c("2020-09-09", Sys.Date()))
[1] "2020-09-09" NA 

The first line coerces the system date correctly, while the second line seems to coerce it to a numeric value first and then to a string. I have never before run into a situation where coercion in R depends on the order of the elements in a vector...

Can someone explain to me why coercion rules behave this way and where I can read more about it...

And what can I do in a situation when the type of elements inside c() is not known a priori?

Thank you!


Solution

  • The default c() unclasses each argument before combining them (unclass(Sys.Date()) is 19251 [as of today, 2022-09-16]); this is because "all attributes except names are removed" by (at least the default method of) c(), and the class is one of those attributes.

    The order matters because c() is an S3 generic function, which means that it dispatches on the class of its first argument: c(<date>, <character>) calls c.Date(), while c(<character>, <date>) calls the default method of c() (which falls through to a primitive function in C which I don't want to bother digging through). The default method drops the Date class and coerces everything to character, giving c("2020-09-09", "19251"); as.Date() cannot parse "19251" as a date, hence the NA.

    The code of c.Date:

    function (..., recursive = FALSE) 
    .Date(c(unlist(lapply(list(...), function(e) unclass(as.Date(e))))))
    

    In other words, it coerces each argument to a date, unclasses it, and then turns the concatenated vector back into dates ...

    A possible workaround/solution is to call c.Date() explicitly, if you know that's what you want ...
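
For the case where the types inside c() are not known in advance, one option is to convert each element to Date before combining, so that the result no longer depends on argument order. The sketch below is only an illustration: combine_dates() is a hypothetical helper name, not a base R function, and it assumes R >= 4.1 and inputs that as.Date() accepts without extra arguments (Date objects or "YYYY-MM-DD" strings). A fixed date stands in for Sys.Date() so the results are reproducible.

```r
# Fixed date used in place of Sys.Date() (assumption, for reproducibility)
today <- as.Date("2022-09-16")

# Character first: the default method of c() drops the Date class,
# so the date becomes its underlying day count, as a string
c("2020-09-09", today)
# [1] "2020-09-09" "19251"

# Hypothetical helper: convert every argument to Date, then combine.
# After lapply() the first element is a Date, so c() dispatches to c.Date().
combine_dates <- function(...) {
  parts <- lapply(list(...), as.Date)
  do.call(c, parts)
}

# The order of the arguments no longer affects the class of the result
combine_dates("2020-09-09", today)
combine_dates(today, "2020-09-09")
```

Converting element by element keeps the coercion explicit. Numeric inputs would need an origin argument in R versions before 4.3, so they are deliberately left out of this sketch.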