Tags: r, datetime, posixct

Parse ISO 8601 date-time in format YYYY-MM-DDTHH:MM:SSZ


I have a large dataframe with time stamps that look like this:

"2019-05-15T01:42:15.072Z"

It resembles an ISO 8601 combined date and time representation.

How can I parse this string into a real date-time format?

The characters (T and Z) inside the data seem to make it difficult.


Solution

  • You can parse the timestamp by specifying the format in as.POSIXct (or strptime):

    as.POSIXct("2019-05-15T01:42:15.072Z", format = "%Y-%m-%dT%H:%M:%OSZ", tz = "UTC")
    #[1] "2019-05-15 01:42:15 UTC"
    

    Explanation:

    %Y, %m and %d denote the year (with century), month and day; %H, %M and %OS denote the hours, minutes and seconds (%OS also reads the fractional part, here the milliseconds). The T and Z are written into the format string as they are, because:

    Any character in the format string not part of a conversion specification is interpreted literally

    See ?strptime for the different conversion specifications.
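
    Since the question mentions a large data frame, here is a rough sketch of the same call applied to a whole column. The data frame `df` and the column names `stamp` and `time` are made up for illustration. as.POSIXct is vectorised, so one call should handle every row, and the fractional seconds are stored in the result even though the default print method hides them:

    ```r
    # Hypothetical data; `df`, `stamp` and `time` are illustrative names only.
    df <- data.frame(
      stamp = c("2019-05-15T01:42:15.072Z", "2019-05-15T01:42:16.500Z"),
      stringsAsFactors = FALSE
    )

    # Parse the whole column at once; T and Z are matched literally.
    df$time <- as.POSIXct(df$stamp, format = "%Y-%m-%dT%H:%M:%OSZ", tz = "UTC")

    # Ask format() for three decimal places to make the milliseconds visible.
    format(df$time, "%Y-%m-%d %H:%M:%OS3")

    # Alternatively, change the print default for the session.
    options(digits.secs = 3)
    ```

    Note that format() truncates rather than rounds the fractional part, so the last printed digit can be one lower than in the input string because of floating-point representation. The stored value is still accurate to well below a millisecond.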

    A comment on timezones

    The Z denotes UTC, but because it is matched as a literal character it does not set the timezone by itself, so we pass tz = "UTC" to as.POSIXct explicitly (as pointed out by @BennyJobigan). If you want the timestamp converted to your local (target) timezone, you can use lubridate::with_tz:

    # In timezone of target, i.e. convert from UTC to local
    lubridate::with_tz(
        as.POSIXct("2019-05-15T01:42:15.072Z", format = "%Y-%m-%dT%H:%M:%OSZ", tz = "UTC"),
        tz = Sys.timezone())
    # [1] "2019-05-15 11:42:15 AEST"
    

    (Obviously the output depends on your local timezone and might be different from what I get.)
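
    If you would rather not depend on lubridate, the same conversion can probably be done in base R. This is only a sketch: the zone name "Australia/Brisbane" is my assumption, picked because it is a UTC+10 zone consistent with the AEST output above; substitute your own zone or Sys.timezone().

    ```r
    x <- as.POSIXct("2019-05-15T01:42:15.072Z",
                    format = "%Y-%m-%dT%H:%M:%OSZ", tz = "UTC")

    # Display the same instant in another zone without modifying x.
    format(x, tz = "Australia/Brisbane", usetz = TRUE)

    # Or change the display zone stored on a copy of the object.
    # The underlying instant (seconds since the epoch) stays the same.
    y <- x
    attr(y, "tzone") <- "Australia/Brisbane"
    y
    ```

    Either way only the displayed clock time changes; x and y still compare as equal.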