c++, icu

How to parse a string representation of a number where the decimal separator is a colon (:)


I'm using ICU in a C++ program to parse decimal values from text that may contain various locale-specific and application-specific representations of decimal values. The first layer of the code applies a regular expression to zoom in on the relevant part of the text, and then I need to de-format the string that (in this case) contains a decimal value.

In the configuration we can set the locale (e.g. "en", "sv") and the ICU decimal format pattern that should be applied. For instance, the locale "en" with the pattern ###,###.## will parse the US English style number "123,456.70" with the following code (error handling omitted):

UErrorCode error = U_ZERO_ERROR;
icu::Formattable binvalue( 0.0 );
icu::Locale iculocale( locale_as_string ); // e.g "sv", "en"
icu::DecimalFormatSymbols symbols( iculocale, error );
icu::DecimalFormat format( pattern, symbols, error ); // pattern as "###,###.##"
format.parse( value, binvalue, error );

Now I need to parse the input string 123:70, which obviously has a custom format. I have no control over the application producing the file. The locale of the input document is "sv" (Swedish), but that is probably not important.

Is it possible to make this work by specifying a locale and an ICU numeric pattern, something like ###,###':'## (which apparently is not a correct example)? From the UI of my application I can (for now) only set the pattern and the locale, so I have no way to customize the DecimalFormatSymbols.

Unfortunately, I have not found much to read about the ICU decimal format other than the DecimalFormat documentation.

In short, can I override the decimal separator via the pattern string? Is there a pattern and locale combination that would parse "123:70" and return the decimal value 123.7?


Solution

  • Your symbols object has a setSymbol method, which is probably what you want to call:

    symbols.setSymbol(icu::DecimalFormatSymbols::kDecimalSeparatorSymbol, ':');
    

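    The pattern string is not locale-specific: the '.' in it is only a placeholder for the decimal separator, and the actual character is taken from the symbols object. You would therefore continue to use the pattern ###,###.##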