
Unexpected behavior of Sys.setlocale


Please see the code below. I have to change my locale to be able to convert a date. My first attempt is unsuccessful. My second attempt works, though it seems redundant and doesn't change the output of Sys.getlocale.

My OS is Windows 7 64-bit.

Sys.getlocale() # "LC_COLLATE=French_Belgium.1252;LC_CTYPE=French_Belgium.1252;LC_MONETARY=French_Belgium.1252;LC_NUMERIC=C;LC_TIME=French_Belgium.1252"
date <- "Dec-11"
as.Date(date, format = "%b-%d")     # NA
Sys.setlocale(locale = "UK")        # "LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=English_United Kingdom.1252;LC_MONETARY=English_United Kingdom.1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252"
locale2 <- Sys.getlocale()
as.Date(date, format = "%b-%d")     # NA
Sys.setlocale("LC_TIME", "English_United Kingdom")
locale3 <- Sys.getlocale()          # "LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=English_United Kingdom.1252;LC_MONETARY=English_United Kingdom.1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252"
as.Date(date, format = "%b-%d")     # "2017-12-11"
locale2 == locale3                  # TRUE

I can skip the first call to Sys.setlocale and the date conversion will work:

Sys.getlocale() # "LC_COLLATE=French_Belgium.1252;LC_CTYPE=French_Belgium.1252;LC_MONETARY=French_Belgium.1252;LC_NUMERIC=C;LC_TIME=French_Belgium.1252"
date <- "Dec-11"
as.Date(date, format = "%b-%d")     # NA
Sys.setlocale("LC_TIME", "English_United Kingdom")
locale4 <- Sys.getlocale()          # "LC_COLLATE=French_Belgium.1252;LC_CTYPE=French_Belgium.1252;LC_MONETARY=French_Belgium.1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252"
as.Date(date, format = "%b-%d")     # "2017-12-11"

But this doesn't work:

Sys.getlocale() # "LC_COLLATE=French_Belgium.1252;LC_CTYPE=French_Belgium.1252;LC_MONETARY=French_Belgium.1252;LC_NUMERIC=C;LC_TIME=French_Belgium.1252"
date <- "Dec-11"
as.Date(date, format = "%b-%d")     # NA
Sys.setlocale(locale = "English_United Kingdom")
locale5 <- Sys.getlocale()          # "LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=English_United Kingdom.1252;LC_MONETARY=English_United Kingdom.1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252"
as.Date(date, format = "%b-%d")     # NA

This is related to this question: Converting integer format date to double format of date


Solution

  • As per the answer of Prof. Brian Ripley:

    This is expected behaviour on Windows. On other systems, R's strptime() relies on the OS-specific strptime function, but Windows doesn't have one. So R uses a substitute function to handle non-English day and month names. Because your default locale is French, your R session is set up to recognize French day and month names and abbreviations.

    This substitute function for strptime uses its own mapping of those day and month names, but this mapping is refreshed ONLY when "LC_TIME" is set specifically. At least this is the case for R 3.4.0 and for earlier versions that use the same mechanism.

    So contrary to my first impression, this is not a bug but a feature :-)
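
Given the explanation above, one way to work around this is to set "LC_TIME" explicitly just for the conversion and then restore the previous value. The following is only a minimal sketch: parse_english_date is a hypothetical helper name (not part of base R), and it assumes that the "C" locale, which uses English month abbreviations, is acceptable for parsing. On Windows you could pass "English_United Kingdom" instead, as in the question.

```r
# Hypothetical helper, not part of base R: parse dates that use English
# month names regardless of the session's current LC_TIME setting.
parse_english_date <- function(x, format, time_locale = "C") {
  # Remember the current LC_TIME value so it can be restored afterwards.
  old <- Sys.getlocale("LC_TIME")
  on.exit(Sys.setlocale("LC_TIME", old), add = TRUE)

  # Set LC_TIME specifically; per the answer above, this is what refreshes
  # the month/day name mapping used on Windows.
  Sys.setlocale("LC_TIME", time_locale)

  as.Date(x, format = format)
}

d <- parse_english_date("Dec-11", format = "%b-%d")
format(d, "%m-%d")  # "12-11" (the year defaults to the current year)
```

Because the helper restores "LC_TIME" on exit, the rest of the session keeps its original locale for formatting and printing dates.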