I have to deal with SimpleDateFormat, but I have an issue with week-year values.
To narrow down the problem, I wrote the simple Java code below and found that it returns two different results with apparently the same settings (the only difference is forcing the locale on the command line). The problem occurs only on a Windows (US-configured) machine: if I run the same test on a Linux (CentOS) machine, everything is OK.
The JVM on Windows is Zulu 8 OpenJDK 1.8.0_282 (but I seem to get the same behavior with Oracle JDK 8), while on Linux it is Red Hat OpenJDK 1.8.0_272.
Here is the source code:
import java.util.Locale;
import java.util.Calendar;
import java.util.TimeZone;
import java.text.SimpleDateFormat;
import java.text.DateFormat;
import java.text.ParseException;
import java.time.LocalDate;
import java.time.temporal.WeekFields;

public class TestDate {
    public static void main(String args[]) throws ParseException {
        Locale currentLocale = Locale.getDefault();
        System.out.println(System.getProperty("java.vendor"));
        System.out.println(System.getProperty("java.version"));
        System.out.println("==============");
        System.out.printf("%20s = %s%n", "getDisplayLanguage", currentLocale.getDisplayLanguage());
        System.out.printf("%20s = %s%n", "getDisplayCountry", currentLocale.getDisplayCountry());
        System.out.printf("%20s = %s%n", "getDisplayVariant", currentLocale.getDisplayVariant());
        System.out.printf("%20s = %s%n", "getLanguage", currentLocale.getLanguage());
        System.out.printf("%20s = %s%n", "getCountry", currentLocale.getCountry());
        System.out.printf("%20s = %s%n", "user.country", System.getProperty("user.country"));
        System.out.printf("%20s = %s%n", "user.language", System.getProperty("user.language"));
        System.out.printf("%20s = %s%n", "user.variant", System.getProperty("user.variant"));
        System.out.println("==============");
        Calendar c = Calendar.getInstance();
        System.out.println("1st day of week / minimal days in 1st week : " + c.getFirstDayOfWeek() + " / " + c.getMinimalDaysInFirstWeek());
        System.out.println("==============");
        LocalDate date1 = LocalDate.of(2020, 12, 31);
        LocalDate date2 = LocalDate.of(2021, 1, 1);
        DateFormat df_date = new java.text.SimpleDateFormat("dd/MM/yyyy");
        DateFormat df_week = new java.text.SimpleDateFormat("YYYY-ww");
        System.out.printf("%20s | %10s | %10s%n", "", df_date.format(java.sql.Date.valueOf(date1)), df_date.format(java.sql.Date.valueOf(date2)));
        System.out.printf("%20s | %10s | %10s%n", "SimpleDateFormat", df_week.format(java.sql.Date.valueOf(date1)), df_week.format(java.sql.Date.valueOf(date2)));
        System.out.printf("%20s | %7d-%02d | %7d-%02d%n", "WeekFields",
                date1.get(WeekFields.ISO.weekBasedYear()), date1.get(WeekFields.ISO.weekOfWeekBasedYear()),
                date2.get(WeekFields.ISO.weekBasedYear()), date2.get(WeekFields.ISO.weekOfWeekBasedYear()));
    }
}
And here are the results (the second one is the expected one):
>java TestDate
Azul Systems, Inc.
1.8.0_282
==============
getDisplayLanguage = English
getDisplayCountry = United States
getDisplayVariant =
getLanguage = en
getCountry = US
user.country = US
user.language = en
user.variant =
==============
1st day of week / minimal days in 1st week : 2 / 4
==============
| 31/12/2020 | 01/01/2021
SimpleDateFormat | 2020-53 | 2020-53
WeekFields | 2020-53 | 2020-53
>java -Duser.language=en -Duser.country=US -Duser.variant= TestDate
Azul Systems, Inc.
1.8.0_282
==============
getDisplayLanguage = English
getDisplayCountry = United States
getDisplayVariant =
getLanguage = en
getCountry = US
user.country = US
user.language = en
user.variant =
==============
1st day of week / minimal days in 1st week : 1 / 1
==============
| 31/12/2020 | 01/01/2021
SimpleDateFormat | 2021-01 | 2021-01
WeekFields | 2020-53 | 2020-53
Both seem to use the same locale settings, but SimpleDateFormat returns a different week / week year. Am I missing some locale setting?
Thank you for your help.
EDIT with Oracle JDK:
>java TestDate
Oracle Corporation
1.8.0_202
==============
getDisplayLanguage = English
getDisplayCountry = United States
getDisplayVariant =
getLanguage = en
getCountry = US
user.country = US
user.language = en
user.variant =
==============
1st day of week / minimal days in 1st week : 2 / 4
==============
| 31/12/2020 | 01/01/2021
SimpleDateFormat | 2020-53 | 2020-53
WeekFields | 2020-53 | 2020-53
>java -Duser.language=en -Duser.country=US -Duser.variant= TestDate
Oracle Corporation
1.8.0_202
==============
getDisplayLanguage = English
getDisplayCountry = United States
getDisplayVariant =
getLanguage = en
getCountry = US
user.country = US
user.language = en
user.variant =
==============
1st day of week / minimal days in 1st week : 1 / 1
==============
| 31/12/2020 | 01/01/2021
SimpleDateFormat | 2021-01 | 2021-01
WeekFields | 2020-53 | 2020-53
EDIT Calendar default Locale:
As pointed out by Scratte, Calendar and SimpleDateFormat use a default Locale. I had a look at the SimpleDateFormat source code, and it uses Locale.getDefault(Locale.Category.FORMAT) as its default Locale, which turns out to be different from the Locale.getDefault() I used in my code.
I finally understood why the two runs behaved differently: I was not displaying the correct Locale (I was not aware of the three distinct default Locales; thank you, Ole V.V., for clarifying this).
TL;DR
SimpleDateFormat uses Locale.getDefault(Locale.Category.FORMAT), and my Java code was displaying the values of Locale.getDefault().
The latter was always en_US, but the former was fr_FR or en_US depending on the command line I used. That's why I got two different outputs for the week / week year.
Finally, the JVM parameters -Duser.language= / -Duser.country= / -Duser.variant= are the solution (they force all three default Locales)!
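The effect described in this TL;DR can be reproduced without touching the command line. The following is only a minimal sketch: the class and method names (FormatLocaleDemo, weekWithDefaultFormatLocale) are made up for illustration, and it assumes the usual week rules for fr_FR (Monday / 4) and en_US (Sunday / 1), as seen in the outputs above. It temporarily changes only the default FORMAT locale and formats the same date with a SimpleDateFormat created without an explicit locale.

```java
import java.text.SimpleDateFormat;
import java.time.LocalDate;
import java.time.ZoneId;
import java.util.Date;
import java.util.Locale;

public class FormatLocaleDemo {

    // Hypothetical helper: formats the date with a SimpleDateFormat created
    // WITHOUT an explicit locale, so it falls back to the default FORMAT locale.
    static String weekWithDefaultFormatLocale(LocalDate date, Locale formatLocale) {
        Locale saved = Locale.getDefault(Locale.Category.FORMAT);
        try {
            // Only the FORMAT category is changed; Locale.getDefault() is untouched.
            Locale.setDefault(Locale.Category.FORMAT, formatLocale);
            Date legacy = Date.from(date.atStartOfDay(ZoneId.systemDefault()).toInstant());
            return new SimpleDateFormat("YYYY-ww").format(legacy);
        } finally {
            // Restore the previous FORMAT default so other code is not affected.
            Locale.setDefault(Locale.Category.FORMAT, saved);
        }
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.of(2020, 12, 31);
        System.out.println("Locale.getDefault() = " + Locale.getDefault());
        System.out.println("FORMAT = fr_FR -> " + weekWithDefaultFormatLocale(date, Locale.FRANCE));
        System.out.println("FORMAT = en_US -> " + weekWithDefaultFormatLocale(date, Locale.US));
    }
}
```

With these assumptions, the two calls should print 2020-53 and 2021-01 respectively, while Locale.getDefault() stays the same in both cases.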
This new code shows the difference between the three default Locales:
import java.sql.Date;
import java.util.Locale;
import java.util.Calendar;
import java.util.TimeZone;
import java.text.SimpleDateFormat;
import java.text.DateFormat;
import java.text.ParseException;
import java.time.LocalDate;
import java.time.temporal.WeekFields;

public class TestDate {
    public static void main(String args[]) throws ParseException {
        Locale cL = Locale.getDefault();
        Locale cLD = Locale.getDefault(Locale.Category.DISPLAY);
        Locale cLF = Locale.getDefault(Locale.Category.FORMAT);
        System.out.println(System.getProperty("java.vendor"));
        System.out.println(System.getProperty("java.version"));
        System.out.println("==============");
        System.out.printf("%20s | %15s | %15s | %15s%n", "Locale.getDefault(.)", "", "DISPLAY", "FORMAT");
        System.out.printf("%20s | %15s | %15s | %15s%n", "getDisplayLanguage", cL.getDisplayLanguage(), cLD.getDisplayLanguage(), cLF.getDisplayLanguage());
        System.out.printf("%20s | %15s | %15s | %15s%n", "getDisplayCountry", cL.getDisplayCountry(), cLD.getDisplayCountry(), cLF.getDisplayCountry());
        System.out.printf("%20s | %15s | %15s | %15s%n", "getDisplayVariant", cL.getDisplayVariant(), cLD.getDisplayVariant(), cLF.getDisplayVariant());
        System.out.printf("%20s | %15s | %15s | %15s%n", "getLanguage", cL.getLanguage(), cLD.getLanguage(), cLF.getLanguage());
        System.out.printf("%20s | %15s | %15s | %15s%n", "getCountry", cL.getCountry(), cLD.getCountry(), cLF.getCountry());
        System.out.printf("%20s | %15s | %15s | %15s%n", "getVariant", cL.getVariant(), cLD.getVariant(), cLF.getVariant());
        System.out.printf("%20s = %s%n", "user.country", System.getProperty("user.country"));
        System.out.printf("%20s = %s%n", "user.language", System.getProperty("user.language"));
        System.out.printf("%20s = %s%n", "user.variant", System.getProperty("user.variant"));
        System.out.println("==============");
        Calendar c = Calendar.getInstance();
        System.out.println("1st day of week / minimal days in 1st week : " + c.getFirstDayOfWeek() + " / " + c.getMinimalDaysInFirstWeek());
        System.out.println("==============");
        LocalDate date1 = LocalDate.of(2020, 12, 31);
        LocalDate date2 = LocalDate.of(2021, 1, 1);
        DateFormat df_date = new java.text.SimpleDateFormat("dd/MM/yyyy");
        DateFormat df_week = new java.text.SimpleDateFormat("YYYY-ww");
        System.out.printf("%20s | %10s | %10s%n", "", df_date.format(java.sql.Date.valueOf(date1)), df_date.format(java.sql.Date.valueOf(date2)));
        System.out.printf("%20s | %10s | %10s%n", "SimpleDateFormat", df_week.format(java.sql.Date.valueOf(date1)), df_week.format(java.sql.Date.valueOf(date2)));
        System.out.printf("%20s | %7d-%02d | %7d-%02d%n", "WeekFields",
                date1.get(WeekFields.ISO.weekBasedYear()), date1.get(WeekFields.ISO.weekOfWeekBasedYear()),
                date2.get(WeekFields.ISO.weekBasedYear()), date2.get(WeekFields.ISO.weekOfWeekBasedYear()));
    }
}
And the corresponding outputs:
>java TestDate
Azul Systems, Inc.
1.8.0_282
==============
Locale.getDefault(.) | | DISPLAY | FORMAT
getDisplayLanguage | English | English | French
getDisplayCountry | United States | United States | France
getDisplayVariant | | |
getLanguage | en | en | fr
getCountry | US | US | FR
getVariant | | |
user.country = US
user.language = en
user.variant =
==============
1st day of week / minimal days in 1st week : 2 / 4
==============
| 31/12/2020 | 01/01/2021
SimpleDateFormat | 2020-53 | 2020-53
WeekFields | 2020-53 | 2020-53
>java -Duser.language=en -Duser.country=US -Duser.variant= TestDate
Azul Systems, Inc.
1.8.0_282
==============
Locale.getDefault(.) | | DISPLAY | FORMAT
getDisplayLanguage | English | English | English
getDisplayCountry | United States | United States | United States
getDisplayVariant | | |
getLanguage | en | en | en
getCountry | US | US | US
getVariant | | |
user.country = US
user.language = en
user.variant =
==============
1st day of week / minimal days in 1st week : 1 / 1
==============
| 31/12/2020 | 01/01/2021
SimpleDateFormat | 2021-01 | 2021-01
WeekFields | 2020-53 | 2020-53
I have not understood how the implementation by Talend ETL can be any of your business. If they have not yet found the opportunity to upgrade to java.time, the modern Java date and time API, it’s their problem, not yours. You should not use SimpleDateFormat or Calendar in your own code.
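For what it is worth, here is a small sketch of how the week and week-based year could be obtained with java.time only. The class and method names (WeekOfYearDemo, weekOf) are hypothetical. WeekFields.ISO is the same definition the question already uses; WeekFields.of(Locale.US) is included only to show that the week definition is an explicit argument here rather than a hidden default.

```java
import java.time.LocalDate;
import java.time.temporal.WeekFields;
import java.util.Locale;

public class WeekOfYearDemo {

    // Hypothetical helper: returns "week-based year - week number" for the
    // given week definition, independent of any JVM default locale.
    static String weekOf(LocalDate date, WeekFields weekFields) {
        int weekYear = date.get(weekFields.weekBasedYear());
        int week = date.get(weekFields.weekOfWeekBasedYear());
        // Locale.ROOT keeps the digits and padding locale-neutral.
        return String.format(Locale.ROOT, "%d-%02d", weekYear, week);
    }

    public static void main(String[] args) {
        LocalDate date1 = LocalDate.of(2020, 12, 31);
        LocalDate date2 = LocalDate.of(2021, 1, 1);
        System.out.println(weekOf(date1, WeekFields.ISO));           // 2020-53
        System.out.println(weekOf(date2, WeekFields.ISO));           // 2020-53
        System.out.println(weekOf(date1, WeekFields.of(Locale.US))); // 2021-01
    }
}
```

Because the week definition is passed in, the result does not change when the default locales of the JVM change.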
Java hasn’t just got one, it’s got three default locales, partly for historical reasons. They can be set individually. To demonstrate:
Locale.setDefault(Locale.FRANCE);
Locale.setDefault(Locale.Category.DISPLAY, Locale.JAPAN);
Locale.setDefault(Locale.Category.FORMAT, Locale.GERMANY);
System.out.println(Locale.getDefault());
System.out.println(Locale.getDefault(Locale.Category.DISPLAY));
System.out.println(Locale.getDefault(Locale.Category.FORMAT));
Output from this snippet is:
fr_FR
ja_JP
de_DE
The output reflects in order France, Japan and Germany (deutsch/Deutschland).
Your comment states that the code of SimpleDateFormat uses the default FORMAT locale as its default locale (so Germany in my example). That is, the locale that it uses when you don’t specify one (you shouldn’t use SimpleDateFormat; if you do nevertheless, you should always specify the locale explicitly).
As I said, the three can be set individually. The one-arg Locale.setDefault() sets all three, though.
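To illustrate that last point, a short sketch (DefaultLocalesDemo and its methods are names invented for this example): it first sets the DISPLAY and FORMAT defaults individually, then calls the one-arg Locale.setDefault() and reports all three defaults. The previous defaults are restored afterwards.

```java
import java.util.Locale;

public class DefaultLocalesDemo {

    // Returns the three default locales, space-separated, in the order
    // general default, DISPLAY, FORMAT.
    static String describeDefaults() {
        return Locale.getDefault() + " "
                + Locale.getDefault(Locale.Category.DISPLAY) + " "
                + Locale.getDefault(Locale.Category.FORMAT);
    }

    // Hypothetical helper: sets the categories individually, then calls the
    // one-arg setDefault and reports what the three defaults look like.
    static String afterOneArgSetDefault(Locale locale) {
        Locale savedDefault = Locale.getDefault();
        Locale savedDisplay = Locale.getDefault(Locale.Category.DISPLAY);
        Locale savedFormat = Locale.getDefault(Locale.Category.FORMAT);
        try {
            Locale.setDefault(Locale.Category.DISPLAY, Locale.JAPAN);
            Locale.setDefault(Locale.Category.FORMAT, Locale.GERMANY);
            Locale.setDefault(locale); // overrides both categories as well
            return describeDefaults();
        } finally {
            // Restore: the one-arg call first, then the two categories.
            Locale.setDefault(savedDefault);
            Locale.setDefault(Locale.Category.DISPLAY, savedDisplay);
            Locale.setDefault(Locale.Category.FORMAT, savedFormat);
        }
    }

    public static void main(String[] args) {
        System.out.println(afterOneArgSetDefault(Locale.FRANCE)); // fr_FR fr_FR fr_FR
    }
}
```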
Does this observation explain your results? On my Java 11 it seems that setting the locale on the command line sets all three default locales (until altered by Locale.setDefault()). I tried just:
System.out.println(Locale.getDefault());
System.out.println(Locale.getDefault(Locale.Category.DISPLAY));
System.out.println(Locale.getDefault(Locale.Category.FORMAT));
I ran this snippet with -Duser.language=en -Duser.country=US on the command line, and the output was:
en_US
en_US
en_US
Other language and country settings also came through in all three locales. So no, this alone doesn’t explain why your SimpleDateFormat in one case did not seem to pick up the locale from the command line.
Does this observation provide a solution?
I still haven’t understood what your real end goal is. The first recommendation is: Your code should not rely on the default locale of the JVM. Use an explicit locale in your locale-sensitive operations.
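As a sketch of what an explicit locale could look like for the week pattern from the question (ExplicitLocaleDemo and formatWeek are illustrative names, not anything from Talend or the original code), the locale is passed to DateTimeFormatter.ofPattern() so that the pattern letters Y and w follow that locale's week definition rather than a JVM default:

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

public class ExplicitLocaleDemo {

    // Hypothetical helper: formats week-based year and week number using the
    // week definition of the locale passed in, not of any default locale.
    static String formatWeek(LocalDate date, Locale locale) {
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern("YYYY-ww", locale);
        return date.format(formatter);
    }

    public static void main(String[] args) {
        LocalDate date = LocalDate.of(2020, 12, 31);
        System.out.println("fr_FR: " + formatWeek(date, Locale.FRANCE));
        System.out.println("en_US: " + formatWeek(date, Locale.US));
    }
}
```

Assuming the usual week rules (Monday / 4 for France, Sunday / 1 for the US), the two lines should show 2020-53 and 2021-01, whatever the command-line locale settings are.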
If you do need to set the default FORMAT locale for Talend ETL to work the way you require it to, Locale.setDefault(Locale.Category.FORMAT, Locale.US); should do it.
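One way to check the effect of such a call is to look at the week rules of Calendar.getInstance(), which also uses the default FORMAT locale. This is only a sketch with made-up names (CalendarWeekRulesDemo, weekRulesUnderFormatLocale); it prints the same "first day of week / minimal days in first week" pair as the test program in the question.

```java
import java.util.Calendar;
import java.util.Locale;

public class CalendarWeekRulesDemo {

    // Hypothetical helper: sets only the default FORMAT locale, then reads the
    // week rules that a default Calendar picks up from it.
    static String weekRulesUnderFormatLocale(Locale formatLocale) {
        Locale saved = Locale.getDefault(Locale.Category.FORMAT);
        try {
            Locale.setDefault(Locale.Category.FORMAT, formatLocale);
            Calendar c = Calendar.getInstance();
            return c.getFirstDayOfWeek() + " / " + c.getMinimalDaysInFirstWeek();
        } finally {
            // Put the previous FORMAT default back.
            Locale.setDefault(Locale.Category.FORMAT, saved);
        }
    }

    public static void main(String[] args) {
        System.out.println("FORMAT = en_US : " + weekRulesUnderFormatLocale(Locale.US));
        System.out.println("FORMAT = fr_FR : " + weekRulesUnderFormatLocale(Locale.FRANCE));
    }
}
```

With the locale data seen in the question, this should give 1 / 1 for en_US and 2 / 4 for fr_FR.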
Related question: Which "default Locale" is which?