Trouble dealing with POSIXct timezones and truncating the time out of POSIXct objects


I have the following piece of R-code:

formatString = "%Y-%m-%d %H:%M:%OS"
x = as.POSIXct(strptime("2013-11-23 23:10:38.000000", formatString))
y = as.POSIXct(strptime("2015-07-17 01:43:38.000000", formatString))

I have the problem that when I do as.Date(y) then I get 2015-07-16 (although its date is one day later!). Apparently the problem is the timezone. So I checked the timezones:

> x
[1] "2013-11-23 23:10:38 CET"
> y
[1] "2015-07-17 01:43:38 CEST"
> 

Ok, so they deviate in their timezone. This is weird, because why does R decide that one timestamp (given without any timezone at all) lies in a different timezone than another (given without any timezone at all)?

Ok, so let's set the timezone. Googling revealed that attr(y, "tzone") <- "CET" should do the trick. Let's try this:

> attr(y, "tzone") <- "CET"
> y
[1] "2015-07-17 01:43:38 CEST"
> 

Ok, that did not work. Let's see what the timezone actually is in the beginning:

> formatString = "%Y-%m-%d %H:%M:%OS"
> x = as.POSIXct(strptime("2013-11-23 23:10:38.000000", formatString))
> y = as.POSIXct(strptime("2015-07-17 01:43:38.000000", formatString))
> unclass(x)
[1] 1385244638
attr(,"tzone")
[1] ""
> unclass(y)
[1] 1437090218
attr(,"tzone")
[1] ""
> 

So... they don't have a timezone at all, but their timezones are different?

--> here are my natural questions:

1) why are they initialized with a different timezone when I don't specify a timezone at all?

2) why do both objects apparently have no timezone, yet still print with different timezones?

3) How can I make as.Date(y) == "2015-07-17" true? I.e. how can I set both to the current timezone? Sys.timezone() results in 'NA'... (EDIT: my timezone [Germany] seems to be "CET" --> how can I set both to CET?)

I'm scratching my head here... Thanks for any thoughts on this you share with me :-)

FW


Solution

  • If you don't specify a timezone, R uses your system's time zone setting (not the locale). A POSIXct value is stored as seconds since 1970-01-01 UTC, and an empty "tzone" attribute means "display in the current system time zone". That is why both objects show "" for tzone yet print differently: CEST is the summer-time (daylight saving) variant of CET, so a timestamp in the summer-time part of the year prints as CEST and one in winter prints as CET. It is the same zone with two different UTC offsets. If you want timestamps that never use summer time, define them as GMT from the beginning.

    formatString = "%Y-%m-%d %H:%M:%OS"
    x = as.POSIXct(strptime("2013-11-23 23:10:38.000000", formatString), tz="GMT")
    y = as.POSIXct(strptime("2015-07-17 01:43:38.000000", formatString), tz="GMT")
    

    If you want to truncate out the time, be careful with as.Date on a POSIXct object: its POSIXct method converts in UTC by default, not in your local zone. 01:43:38 CEST is 23:43:38 UTC on the previous day, which is why as.Date(y) gives 2015-07-16. Pass your zone explicitly (for Germany that would be "Europe/Berlin"), e.g. as.Date(y, tz = "Europe/Berlin"), to get 2015-07-17. If you want the result to stay a POSIXct object in base R, wrap either round or trunc in as.POSIXct (both return POSIXlt), but I would recommend checking out the lubridate package for dealing with dates and times (specifically POSIXct objects).

    If you want to keep CET but never use CEST, you can use a location that doesn't observe daylight saving time. According to http://www.timeanddate.com/time/zones/cet your only options are Algeria and Tunisia. According to https://en.wikipedia.org/wiki/List_of_tz_database_time_zones the valid tz for Algeria would be "Africa/Algiers". Therefore you could do

    formatString = "%Y-%m-%d %H:%M:%OS"
    x = as.POSIXct(strptime("2013-11-23 23:10:38.000000", formatString), tz="Africa/Algiers")
    y = as.POSIXct(strptime("2015-07-17 01:43:38.000000", formatString), tz="Africa/Algiers")
    

    and both x and y would be in CET.

    One more thing about setting timezones. In the tz database, the name "CET" refers to a zone that follows the Central European rules, summer time included, so it still switches to CEST in summer. That's why setting attr(y, "tzone") <- "CET" didn't have the desired result. If you did attr(y, "tzone") <- "Africa/Algiers" then y would print with CET as you expected. Do be careful with conversions, though: changing the tzone attribute keeps the underlying instant and changes the displayed clock time to match the new zone (here y would print as 00:43:38 CET rather than 01:43:38). The package lubridate has the function force_tz, which changes the timezone without changing the clock time, for cases where the initial timezone setting was wrong but the clock time was right.
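
To make the date truncation from question 3 concrete, here is a minimal base R sketch. It assumes the asker's zone is "Europe/Berlin" (my assumption, based on the mention of Germany) and passes it explicitly so the result doesn't depend on the machine's settings.

```r
# Assumption: the asker's local zone is "Europe/Berlin" (Germany was mentioned).
y <- as.POSIXct("2015-07-17 01:43:38", tz = "Europe/Berlin")

# as.Date() on a POSIXct converts in UTC unless told otherwise, and
# 01:43:38 CEST is 23:43:38 UTC on the previous day.
d_utc <- as.Date(y)

# Passing the zone explicitly keeps the local calendar date.
d_local <- as.Date(y, tz = "Europe/Berlin")

# trunc() drops the time of day but returns a POSIXlt,
# so wrap it in as.POSIXct() to get a POSIXct back.
y_midnight <- as.POSIXct(trunc(y, units = "days"))

print(d_utc)
print(d_local)
print(y_midnight)
```

If you only need the calendar date, the as.Date call with tz is enough; the trunc version is for when the result has to stay a date-time.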
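
The difference between converting a time to another zone and relabelling it can also be sketched in base R. The zone names are the ones discussed in this answer; rebuilding the value from format() output is my own base R stand-in for the effect of lubridate's force_tz, not that function's implementation.

```r
# Start from the asker's value, assuming it was parsed in "Europe/Berlin".
y <- as.POSIXct("2015-07-17 01:43:38", tz = "Europe/Berlin")

# Changing the tzone attribute keeps the instant and only changes how it prints.
y_converted <- y
attr(y_converted, "tzone") <- "Africa/Algiers"

# Re-parsing the printed clock time in the new zone keeps the clock time
# and therefore moves the underlying instant.
y_relabelled <- as.POSIXct(format(y, "%Y-%m-%d %H:%M:%S"), tz = "Africa/Algiers")

print(format(y_converted, "%Y-%m-%d %H:%M:%S"))   # clock time shifts
print(format(y_relabelled, "%Y-%m-%d %H:%M:%S"))  # clock time is preserved
print(as.numeric(y_relabelled) - as.numeric(y))   # shift of the instant, in seconds
```

Which of the two you want depends on whether the original zone was right (convert) or wrong (relabel).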
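
One way to check whether a zone name follows summer time is to print the UTC offset for a winter and a summer timestamp. This sketch uses the two zone names from this answer and the "%z" output code of format(); the helper name utc_offsets is made up for illustration.

```r
# Hypothetical helper: UTC offsets of the asker's two timestamps in a given zone.
utc_offsets <- function(tz) {
  times <- as.POSIXct(c("2013-11-23 23:10:38", "2015-07-17 01:43:38"), tz = tz)
  format(times, "%z")
}

print(utc_offsets("CET"))             # winter and summer offsets differ
print(utc_offsets("Africa/Algiers"))  # same offset for both timestamps
```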