Tags: r, linux, timezone, lubridate, tzdata

unrecognized time zone


With a recent update on Ubuntu (23.10 mantic), my R no longer recognizes "US/Eastern".

sessionInfo()
# R version 4.3.2 (2023-10-31)
# Platform: x86_64-pc-linux-gnu (64-bit)
# Running under: Ubuntu 23.10
# Matrix products: default
# BLAS:   /opt/R/4.3.2/lib/R/lib/libRblas.so 
# LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3;  LAPACK version 3.11.0
# locale:
#  [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8        LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8   
#  [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C           LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C   
# time zone: America/New_York
# tzcode source: system (glibc)
# attached base packages:
# [1] stats     graphics  grDevices utils     datasets  methods   base     
# other attached packages:
# [1] r2_0.10.0
# loaded via a namespace (and not attached):
#  [1] compiler_4.3.2  clipr_0.8.0     fastmap_1.1.1   cli_3.6.2       tools_4.3.2     htmltools_0.5.7 rmarkdown_2.25  knitr_1.45      xfun_0.41      
# [10] digest_0.6.34   rlang_1.1.3     evaluate_0.23  

lubridate::with_tz(Sys.time(), tzone = "US/Eastern")
# Warning in with_tz.default(Sys.time(), tzone = "US/Eastern") :
#   Unrecognized time zone 'US/Eastern'
# [1] "2024-03-18 13:49:56"

On a similarly-configured (R-wise) 22.04 jammy system, however, it works just fine.

sessionInfo()
# R version 4.3.2 (2023-10-31)
# Platform: x86_64-pc-linux-gnu (64-bit)
# Running under: Ubuntu 22.04.4 LTS
# Matrix products: default
# BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
# LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/libopenblasp-r0.3.20.so;  LAPACK version 3.10.0
# locale:
#  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
# time zone: Etc/UTC
# tzcode source: system (glibc)
# attached base packages:
# [1] stats     graphics  grDevices utils     datasets  methods   base
# loaded via a namespace (and not attached):
# [1] compiler_4.3.2

lubridate::with_tz(Sys.time(), tzone = "US/Eastern")
# [1] "2024-03-18 09:49:19 EDT"

Why does a normally-recognized TZ become unusable?


This is true on the OS itself, not just in R:

$ TZ="America/New_York" date
Mon Mar 18 10:22:03 AM EDT 2024
$ TZ="US/Eastern" date
Mon Mar 18 02:22:07 PM  2024

(Notice the missing zone abbreviation in the second output; the time itself has fallen back to UTC.)
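A quick way to narrow this down from the shell is to check whether each name resolves to a file in the system zoneinfo tree. This is a minimal sketch, assuming the usual Ubuntu location `/usr/share/zoneinfo`; the helper name `tz_known` and the `ZONEINFO_DIR` override are hypothetical, introduced only for illustration.

```shell
# Sketch: report whether a zone name resolves to a file in the zoneinfo tree.
# ZONEINFO_DIR is an assumed override; /usr/share/zoneinfo is the usual Ubuntu path.
tz_known() {
  [ -e "${ZONEINFO_DIR:-/usr/share/zoneinfo}/$1" ]
}

# Compare a Continent/City name with its legacy Country/Region alias.
for tz in "America/New_York" "US/Eastern"; do
  if tz_known "$tz"; then
    echo "$tz: found"
  else
    echo "$tz: missing"
  fi
done
```

If the legacy name is reported as missing while the Continent/City name is found, the problem likely lies in the installed zone data rather than in R or lubridate.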


Solution

  • The debate over the use of "Country/Region" (e.g. "US/Eastern") as opposed to "Continent/City" ("America/New_York") is not new. The latter is less ambiguous, whereas geopolitical changes can alter the meaning of the former. So far (and still, as far as I can tell), the stance has been to maintain backward compatibility.

    However, with the tzdata 2024a release, the Ubuntu 23.10 package (2024a-0ubuntu0.23.10) does not include the US/ symlinks, while the same release packaged for Ubuntu 22.04 (2024a-0ubuntu0.22.04) still does.

    Based on https://bugs.launchpad.net/ubuntu/+source/tzdata/+bug/2058249, the proper (and intended) fix is to install the tzdata-legacy Linux package (and then restart R).

    My first solution/hack is below, written before I learned about the tzdata-legacy package (above). The hack was easy enough given that I have root access to the underlying filesystem. Unless you are loath to install the extra package for some reason, you should likely go with tzdata-legacy instead. (These symlinks are only the few that I wanted; the tzdata-legacy package has another 675 symlinks/files. The package split affects a lot more than just US/*, after all.)

    # Run as root; recreates only the US/* legacy names as symlinks.
    mkdir -p /usr/share/zoneinfo/US
    cd /usr/share/zoneinfo/US
    ln -s ../America/Anchorage Alaska
    ln -s ../America/Adak Aleutian
    ln -s ../America/Phoenix Arizona
    ln -s ../America/Chicago Central
    ln -s ../America/New_York Eastern
    ln -s ../America/Indiana/Indianapolis East-Indiana
    ln -s ../Pacific/Honolulu Hawaii
    ln -s ../America/Indiana/Knox Indiana-Starke
    ln -s ../America/Detroit Michigan
    ln -s ../America/Denver Mountain
    ln -s ../America/Los_Angeles Pacific
    ln -s ../Pacific/Pago_Pago Samoa
    

    After that, restart R (this should not require reinstalling the lubridate or timechange R packages), and it should then work. (I don't use RStudio, but you may need to restart that as well; feedback on this is welcome.)

    lubridate::with_tz(Sys.time(), tzone = "US/Eastern")
    # [1] "2024-03-18 09:55:08 EDT"
    

    And in a shell (outside of R) as well:

    $ TZ="US/Eastern" date
    Mon Mar 18 10:23:11 AM EDT 2024
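
For the recommended tzdata-legacy route (rather than the manual symlinks), the steps can be sketched as follows. This is a minimal sketch, assuming an apt-based system with sudo available; the helper name `ensure_legacy_tz` and the `ZONEINFO_DIR` override are hypothetical, and the function only installs the package when `US/Eastern` does not already resolve.

```shell
# Hypothetical helper: install tzdata-legacy only if US/Eastern does not resolve.
# Assumes apt-get and sudo exist; ZONEINFO_DIR is an illustrative override
# for the usual /usr/share/zoneinfo location.
ensure_legacy_tz() {
  zoneinfo_dir="${ZONEINFO_DIR:-/usr/share/zoneinfo}"
  if [ -e "$zoneinfo_dir/US/Eastern" ]; then
    echo "US/Eastern already present"
    return 0
  fi
  # Refresh the package index, then install the split-out legacy names.
  sudo apt-get update && sudo apt-get install -y tzdata-legacy
}
```

After calling `ensure_legacy_tz`, restart R; running `TZ="US/Eastern" date` in the shell is a quick way to confirm that the zone abbreviation is shown again.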