In our C++ codebase, we have a default formatting method that converts double floating point numbers into strings. It is used notably for JSON serialization and for debug logs. For that default number formatting, I have the following contradictory requirements:
1. Do not convert 1000 to 1e3 or 0.125 to 1.25e-1.
2. Do not truncate 3.1415926535897931 to 3.14.
3. Do not expand 0.1 to 0.10000000000000001.
Up to now, the best tradeoff I found is to use the equivalent of printf("%.15g", value) formatting. It fulfills requirements 1 and 3, but not completely 2. There is a loss of precision of about 4 bits.
Other people use a default formatting based on "%.17g", which fulfills requirements 1 and 2, but not 3. The number 0.2 is for example formatted as 0.20000000000000001.
In between, the format "%.16g" comes close to fulfilling requirements 2 and 3, but does not always satisfy both.
As an illustration, I want 0.3 to be formatted as 0.3, but 0.1+0.2, which is slightly bigger due to rounding errors, to be formatted as 0.30000000000000004 so that the difference is visible.
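To make the trade-off between the three precisions concrete, here is a minimal sketch. The helper name formatG is hypothetical and only wraps the "%.*g" formatting discussed above.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper (not part of the codebase described above):
// formats value as printf("%.<digits>g") would.
std::string formatG(double value, int digits)
{
    char buffer[40];
    std::snprintf(buffer, sizeof(buffer), "%.*g", digits, value);
    return buffer;
}
```

With IEEE 754 doubles, formatG(0.3, 15) and formatG(0.1 + 0.2, 15) both give 0.3, so the two distinct values collapse (requirement 2). formatG(0.1 + 0.2, 17) gives 0.30000000000000004 as wished, but formatG(0.3, 17) gives 0.29999999999999999 (requirement 3). With 16 digits, both values are again printed as 0.3.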
I wrote the following function, which formats floating point numbers the way I wish, as a proof of concept. However, it is unacceptable from a performance point of view, since it can make up to 34 conversions between double and strings (17 calls to sprintf and 17 calls to atof), for a limited precision gain over the current implementation with "%.15g".
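#include <cmath>
#include <cstdio>
#include <cstdlib>
#include <string>

std::string doubleToString(double number)
{
    char buffer[32];
    // Only cast when the value fits in a long long; the cast is undefined otherwise.
    if(std::fabs(number) < 9e18 && static_cast<long long>(number) == number)
    {
        // Integral values are printed without exponent or decimal point.
        std::snprintf(buffer, sizeof(buffer), "%lld", static_cast<long long>(number));
    }
    else
    {
        // Increase the precision until the text reads back as the same double.
        for(int i=1; i<=17; i++)
        {
            std::snprintf(buffer, sizeof(buffer), "%.*g", i, number);
            double readBack = atof(buffer);
            if(readBack == number)
                break;
        }
    }
    return buffer;
}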
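As a possible way to cut down the number of conversions while staying with printf-style formatting, the loop could start at 15 digits instead of 1. This is only a sketch under assumptions: it relies on "%g" stripping trailing zeros, so a value whose shortest form needs fewer than 15 digits is expected to come out already short at precision 15 (this may not hold for subnormal values). The function name doubleToStringFewTries is hypothetical, and the integer branch of the function above is left out, so large integral values may be printed with an exponent.

```cpp
#include <cstdio>
#include <cstdlib>
#include <string>

// Hypothetical variant: at most three format/parse pairs instead of seventeen.
std::string doubleToStringFewTries(double number)
{
    char buffer[32];
    for (int digits = 15; digits <= 17; ++digits)
    {
        std::snprintf(buffer, sizeof(buffer), "%.*g", digits, number);
        // Stop at the first precision whose text reads back as the same double.
        if (std::strtod(buffer, nullptr) == number)
            break;
    }
    return buffer;
}
```

This still costs up to six conversions for values that need 17 digits, so it reduces the overhead rather than removing it.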
I just realized that Python is already formatting numbers the way I want:
$ python3
Python 3.8.6 (default, Oct 8 2020, 14:06:32)
[Clang 12.0.0 (clang-1200.0.32.2)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> 0.3
0.3
>>> 0.1+0.2
0.30000000000000004
>>>
Is there a way to get the same behavior in C++ without sacrificing too much performance?
After Frodyne's comment, I was able to figure out a very simple and fast solution.
The C++17 std::to_chars function, by default, formats floating point numbers so as to fulfill the shortest round-trip requirement. That means that all distinct floating point numbers remain distinct after serialization, and the number of characters needed to format each of them is minimized.
So the conversion can be written like this in standard C++17:
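#include <charconv>
#include <string>

std::string doubleToString(double number)
{
    char buffer[24];
    // Without a format argument, std::to_chars emits the shortest round-trip form.
    // It does not write a terminating null character, hence the (begin, end) constructor.
    std::to_chars_result result = std::to_chars(buffer, buffer + sizeof(buffer), number);
    return std::string(buffer, result.ptr);
}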
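The round-trip property can be checked with a small sketch like the one below. It assumes a standard library whose std::to_chars supports double. The names toShortestString and roundTrips are hypothetical, and the 32-byte buffer with an explicit error check is a defensive choice for illustration, not something the standard requires.

```cpp
#include <charconv>
#include <cstdlib>
#include <string>
#include <system_error>

// Hypothetical helper: shortest round-trip formatting with an explicit error check.
std::string toShortestString(double value)
{
    char buffer[32];
    const std::to_chars_result result =
        std::to_chars(buffer, buffer + sizeof(buffer), value);
    if (result.ec != std::errc())
        return std::string();  // buffer too small; not expected with 32 bytes
    return std::string(buffer, result.ptr);
}

// Returns true when parsing the formatted text yields exactly the same double.
bool roundTrips(double value)
{
    const std::string text = toShortestString(value);
    return !text.empty() && std::strtod(text.c_str(), nullptr) == value;
}
```

Under these assumptions, toShortestString(0.3) gives 0.3 and toShortestString(0.1 + 0.2) gives 0.30000000000000004, matching the Python session above.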
The great news from the Microsoft lecture is that, in addition to solving the shortest round-trip problem, the implementation in MSVC is blazing fast! It is based on the incredible Ryu algorithm.
The bad news is that, at the time of writing, std::to_chars is only available for floating point numbers in the Microsoft tool chain. The implementations in Clang libc++ and GCC libstdc++ are for the moment limited to integer numbers.