
Differences in currency formatting using Number.toLocaleString()


I've been looking into locale-aware number formatting for JavaScript and found that Number.toLocaleString, and by extension Intl.NumberFormat, appear to be good solutions for this problem.

In particular I'd prefer building on top of already implemented abstractions for locale specific formatting rather than reinventing the wheel and coming up with one more solution to an already solved problem.


So I've called Number.toLocaleString in several different JavaScript environments and found that currency formatting seems to have changed:

(5).toLocaleString('fr-CH', {currency: 'CHF', style: 'currency'});
// Node v10.15.1:  'CHF 5.00'
// Node v12.1.0:   '5.00 CHF'
// Firefox 66.0.2: '5.00 CHF'
// Chrome 73.0.…:  '5.00 CHF'
// Safari 12.0.3:  '5.00 CHF'
// IE 11:          '5.00 fr.'
  • IE 11 is different than the rest, but it doesn't surprise me given its age.
  • What surprises me is that the formatting for CHF in fr-CH seems to have changed between node versions 10 and 12.
  • For comparison I had a look at the glibc LC_MONETARY settings for fr_CH and found that they seem to have placed the CHF before the amount since at least about 1997. This makes it particularly confusing that most current browsers place the CHF after the amount.
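To compare environments without eyeballing the formatted strings, the position of the currency code can be read from Intl.NumberFormat.prototype.formatToParts. This is only a minimal sketch: currencyPosition is a hypothetical helper name, and formatToParts is not available in older engines such as IE 11. The result depends on the locale data bundled with the runtime, so no expected output is shown.

```javascript
// Hypothetical helper: reports whether the currency symbol/code is placed
// before or after the integer digits for a given locale and currency.
function currencyPosition(locale, currency) {
  const parts = new Intl.NumberFormat(locale, { style: 'currency', currency })
    .formatToParts(5);
  const types = parts.map((part) => part.type);
  const currencyIndex = types.indexOf('currency');
  const integerIndex = types.indexOf('integer');
  if (currencyIndex === -1 || integerIndex === -1) return 'unknown';
  return currencyIndex < integerIndex ? 'before' : 'after';
}

// The output depends on the ICU/CLDR data of the environment running this.
console.log('fr-CH:', currencyPosition('fr-CH', 'CHF'));
console.log('de-CH:', currencyPosition('de-CH', 'CHF'));
```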

I would like to know and understand:

  1. Why are the positions of the CHF different in these cases?
    • I know that this can depend on the available system locales or the browser. But the change between node versions seems to indicate a more recent and voluntary change to me.
  2. Is there a correct way to place the CHF or are both choices acceptable for CH, or more specifically fr-CH?
    • For this it would be beautiful to have an actual source like a paper or research database rather than hearsay or anecdotes.

Update (2019-05-16):

In reaction to my partial answer I'd like to specify:

  • The formatting decision for fr_CH is given as currencyFormat{"#,##0.00 ¤ ;-#,##0.00 ¤"} in commit 3bfe134 but I'm still missing a source for the decision and would love to know about it.

Solution

  • So I've checked out the v8 source to see if I can find where the behavior of Number.toLocaleString is defined.

    • In builtins-number.cc I found the BUILTIN(NumberPrototypeToLocaleString){…} which uses Intl::NumberToLocaleString(…).
    • This led me to intl-objects.cc which implements the Intl::NumberToLocaleString using an icu::number::LocalizedNumberFormatter.

    Since v8 uses icu I checked out the source to continue my search.

    • My first attempts to find the source of the number formatting led me to look at decimfmt and numfmt, but I somehow kept losing the trail.
    • Then it dawned on me that it would likely make sense to keep the format definitions somewhat separate from the rest of the code. By looking around the website and the source more I finally found icu4c/source/data/locales/de_CH.txt and icu4c/source/data/locales/fr_CH.txt.
      • de_CH.txt has currencyFormat{"¤ #,##0.00;¤-#,##0.00"}.
      • fr_CH.txt has currencyFormat{"#,##0.00 ¤ ;-#,##0.00 ¤"}.
    • Now using git I found the commit that first introduced the currencyFormat for fr_CH (3bfe134) 19 months ago.
      • It is plausible that this change landed between node v10 and v12.
      • I can also see that it would make sense to fall back on de_CH before the currencyFormat was added to fr_CH, which would explain why the format changed the way it did.
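The two patterns above differ only in where the currency placeholder ¤ sits relative to the digit placeholders. As a rough illustration, the following sketch reads that position from a pattern string. It is a simplification and not how ICU parses patterns: symbolPositionFromPattern is a hypothetical helper that only looks at the positive subpattern (the part before the semicolon) and ignores quoting rules.

```javascript
// Hypothetical helper: inspects the positive subpattern of an ICU-style
// currency pattern and reports where the currency placeholder (¤) sits.
function symbolPositionFromPattern(pattern) {
  const positive = pattern.split(';')[0];
  const symbolIndex = positive.indexOf('¤');
  const digitIndex = positive.search(/[#0]/);
  if (symbolIndex === -1 || digitIndex === -1) return 'unknown';
  return symbolIndex < digitIndex ? 'before' : 'after';
}

console.log(symbolPositionFromPattern('¤ #,##0.00;¤-#,##0.00'));   // de_CH pattern: 'before'
console.log(symbolPositionFromPattern('#,##0.00 ¤ ;-#,##0.00 ¤')); // fr_CH pattern: 'after'
```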

    The commit mentions CLDR 32 alpha, and I found the CLDR charts version 32. However I'm currently not able to figure out where the chart is located that defines the currencyFormat for fr_CH.

    I feel that by finding the change to the fr_CH currencyFormat I have found, and now understand, the cause of the different behavior between node versions.
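To check which locale data a given Node build ships with, process.versions can be inspected. This is a sketch, assuming a Node build with Intl support: the icu, cldr and unicode fields are undefined on builds without ICU, and icuVersions is a hypothetical helper name. Outside of Node the helper returns only undefined fields.

```javascript
// Hypothetical helper: collects the ICU-related version fields that Node
// exposes on process.versions (undefined if not available).
function icuVersions() {
  const versions =
    (typeof process !== 'undefined' && process.versions) || {};
  return {
    node: versions.node,
    icu: versions.icu,
    cldr: versions.cldr,
    unicode: versions.unicode,
  };
}

console.log(icuVersions());
```

Running this under both node versions would show whether the bundled CLDR version is older or newer than CLDR 32.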

    As of now I don't understand why glibc and icu have differences here, but that is something I can ask around in the context of the specific projects for.

    I'm under the impression that I'm still missing the specific decision or data-point which led to the currencyFormat implementation - if I find it I shall add it here and be satisfied.

    Update 2019-05-18:

    • The CLDR 32 data can be found in the download section on cldr.unicode.org.
      • From there I could download the cldr-common-32.zip which included the file common/main/fr_CH.xml in which the currency format is defined like this:
    <currencyFormats numberSystem="latn">
      <currencyFormatLength>
        <currencyFormat type="standard">
          <pattern draft="contributed">#,##0.00 ¤ ;-#,##0.00 ¤</pattern>
        </currencyFormat>
      </currencyFormatLength>
    </currencyFormats>
    

    [Screenshot: number formatting decisions for fr_CH]

    Update 2019-05-21:

    So out of curiosity I've asked about this on the libc-locales list as well as on the closest ticket I could find on the unicode-org ticket system.

    This prompted me to investigate further, and while researching this with a friend we stumbled upon the cldr repo on GitHub, which is dedicated to the CLDR data itself rather than merely containing CLDR-derived data the way icu does.

    We found that commit c5e7787 introduced the first change that led to the CHF being placed after the number rather than before it, and through that commit we became aware of two tickets: CLDR-9370 and CLDR-10755, the second of which is a follow-up that clears up some formatting.

    While on the surface CLDR-9370 seems to mostly discuss the decimal separator, the currency symbol placement is discussed as well.

    One of the sources given is a typography guide (pdf) published by CERN, which gives detailed instructions on how to write numbers.

    For CHF the guide notes:

    [Screenshot from CERN's typography guide, the section on writing sums of money]

    Using Google Translate this translates to:

    Writing sums of money

    The number is written in three-digit increments separated by a non-breaking space (no point or apostrophe of separation), and is followed (and never preceded) by the indication of the currency is long or abbreviated. For name abbreviations currency, we use the ISO code.