Tags: java, timezone, icu4j

How can I get the "current" IANA time zone abbreviation throughout time in ICU4J?


I'm currently trying to write a suite of time zone validation programs to see whether various platforms interpret the IANA time zone data correctly.

The output format I'm targeting includes the abbreviation in effect for a particular time - such as "BST" for "British Summer Time", or "PST" for "Pacific Standard Time".

On most platforms, this is easy - but ICU4J seems not to cooperate, oddly. According to the SimpleDateFormat documentation, a pattern of "zzz" should give me what I'm looking for, but much of the time it falls back to the "O" pattern of GMT+X. For some time zones, there are no abbreviations at all.

Short example using New York:

import java.util.Date;
import java.util.Locale;
import com.ibm.icu.util.TimeZone;
import com.ibm.icu.text.SimpleDateFormat;

public class Test {
    public static void main(String[] args) {
        TimeZone zone = TimeZone.getTimeZone("America/New_York");
        SimpleDateFormat format = new SimpleDateFormat("zzz", Locale.US);
        format.setTimeZone(zone);

        // One month before the Unix epoch
        System.out.println(format.format(new Date(-2678400000L))); // prints GMT-5

        // At the Unix epoch
        System.out.println(format.format(new Date(0L))); // prints EST
    }
}
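
For reference, the two millisecond values used above can be checked with plain java.time, without involving ICU4J at all. This is only a sanity check of which instants are being formatted; the class and method names are made up for illustration.

```java
import java.time.Instant;

public class EpochCheck {
    // Renders a millisecond timestamp as an ISO-8601 instant in UTC.
    static String describe(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis).toString();
    }

    public static void main(String[] args) {
        // 31 days * 86,400,000 ms/day = 2,678,400,000 ms before the epoch
        System.out.println(describe(-2678400000L)); // 1969-12-01T00:00:00Z
        System.out.println(describe(0L));           // 1970-01-01T00:00:00Z
    }
}
```

So the first call formats midnight UTC on 1 December 1969, and the second formats the epoch itself.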

(I'm running using ICU4J 55.1, both the stock download and after updating it with the 2015e data release.)

It's not clear to me whether ICU4J is getting its abbreviations from the tz data or from CLDR - I suspect it's the latter, given that there's nothing in the tz data to suggest a difference here.

It also seems to be affected by locale, which I suppose is reasonable - using the US locale I can see EST/EDT for America/New_York, but nothing for Europe/London; with the UK locale I see GMT/BST for Europe/London, but nothing for America/New_York :(

Is there a way to persuade ICU4J to fall back to tz abbreviations? In my very specific case, that's all I'm looking for.

Update

Thanks to RealSkeptic's comments, it looks like TimeZoneNames is a cleaner way of getting this data without formatting. It all sounds so promising - there's even TimeZoneNames.getTZDBInstance:

Returns an instance of TimeZoneNames containing only short specific zone names (TimeZoneNames.NameType.SHORT_STANDARD and TimeZoneNames.NameType.SHORT_DAYLIGHT), compatible with the IANA tz database's zone abbreviations (not localized).

That's pretty much exactly what I want - but in most cases it doesn't go back earlier than 1970 either, nor does it include all the relevant data:

import static com.ibm.icu.text.TimeZoneNames.NameType.SHORT_STANDARD;

import com.ibm.icu.text.TimeZoneNames;
import com.ibm.icu.util.ULocale;

public class Test {
    public static void main(String[] args) {
        TimeZoneNames names = TimeZoneNames.getTZDBInstance(ULocale.ROOT);

        long december1969 = -2678400000L;
        // 24 hours into the Unix epoch...
        long january1970 = 86400000L;

        // null
        System.out.println(
            names.getDisplayName("America/New_York", SHORT_STANDARD, december1969));
        // EST
        System.out.println(
            names.getDisplayName("America/New_York", SHORT_STANDARD, january1970));

        // null
        System.out.println(
            names.getDisplayName("Europe/London", SHORT_STANDARD, december1969));
        // null
        System.out.println(
            names.getDisplayName("Europe/London", SHORT_STANDARD, january1970));
    }
}

Given that there's really very little indirection at this point - I'm telling ICU4J exactly what I want - my suspicion is that the information just isn't available :(


Solution

  • Tracing through the sources shows how this works: to find the display name, ICU4J first resolves the meta zone ID from the zone ID and the date, and then looks up the display name from the meta zone ID and the name type.

    com.ibm.icu.impl.TZDBTimeZoneNames, which is the class returned from TimeZoneNames.getTZDBInstance(ULocale), implements getMetaZoneID(String, long) by calling com.ibm.icu.impl.TimeZoneNamesImpl._getMetaZoneID(String, long). That method retrieves the mappings from the given time zone ID to meta zone IDs, and then checks whether the date falls between the from and to values of any of those mappings.

    The mapping is read by a nested class, like this:

    for (int idx = 0; idx < zoneBundle.getSize(); idx++) {
        UResourceBundle mz = zoneBundle.get(idx);
        String mzid = mz.getString(0);
        String fromStr = "1970-01-01 00:00";
        String toStr = "9999-12-31 23:59";
        if (mz.getSize() == 3) {
            fromStr = mz.getString(1);
            toStr = mz.getString(2);
        }
        long from, to;
        from = parseDate(fromStr);
        to = parseDate(toStr);
        mzMaps.add(new MZMapEntry(mzid, from, to));
    }
    

    (source)

    As you can see, the from and to values default to hard-coded dates. They are read from the resource bundle only when the meta zone entry has three items, and most entries don't, as can be seen in the actual meta zone file from which the bundle is built. Those that do have three items also have no 'from' dates before January 1970.

    Thus, the meta zone ID will be null for any date before January 1970, and in turn, so will the display name.
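
To make the effect of that default concrete, here is a minimal, self-contained sketch of the lookup described above. It is not ICU4J's code: the class, the Entry type and the helper methods are hypothetical stand-ins, the dates are assumed to be parsed as UTC, and the interval check is assumed to be half-open (from inclusive, to exclusive). The meta zone ID "America_Eastern" is used purely as sample data.

```java
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

public class MetaZoneLookupSketch {
    private static final DateTimeFormatter FMT =
        DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm");

    // Hypothetical stand-in for a zone-to-meta-zone mapping entry.
    public static final class Entry {
        final String mzid;
        final long from; // milliseconds since the epoch, inclusive (assumed)
        final long to;   // milliseconds since the epoch, exclusive (assumed)

        Entry(String mzid, long from, long to) {
            this.mzid = mzid;
            this.from = from;
            this.to = to;
        }
    }

    // Parses "yyyy-MM-dd HH:mm" as a UTC instant (assumption about the time zone).
    static long parseDate(String text) {
        return LocalDateTime.parse(text, FMT).toInstant(ZoneOffset.UTC).toEpochMilli();
    }

    // Applies the same defaults as the loop quoted above when no bounds are given.
    static Entry entry(String mzid, String fromStr, String toStr) {
        String from = fromStr != null ? fromStr : "1970-01-01 00:00";
        String to = toStr != null ? toStr : "9999-12-31 23:59";
        return new Entry(mzid, parseDate(from), parseDate(to));
    }

    // Returns the first meta zone ID whose interval contains the date, or null.
    static String metaZoneId(List<Entry> entries, long date) {
        for (Entry e : entries) {
            if (date >= e.from && date < e.to) {
                return e.mzid;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Entry> newYork = new ArrayList<>();
        newYork.add(entry("America_Eastern", null, null)); // entry without explicit bounds

        System.out.println(metaZoneId(newYork, -2678400000L)); // null
        System.out.println(metaZoneId(newYork, 86400000L));    // America_Eastern
    }
}
```

With an entry that carries no explicit bounds, any timestamp below zero falls outside the defaulted interval, so the lookup yields null, which matches the behaviour seen in the question.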