Tags: c++, visual-c++, boost, boost-spirit, boost-spirit-karma

boost spirit karma real generator performance

While checking Boost Spirit Karma generator performance, I was somewhat surprised by a slowdown when using a policy for real numbers. Live on Coliru
The code was taken from Boost Spirit, with a couple of test functions added. The Coliru example replaces the timer used. Note that Coliru aborts long-running programs, so it may not finish all the tests.
As one can see, using a policy makes the generator 2-3 times slower (about 10 times on Coliru). Is this expected behavior?

My figures:

sprintf: 0.367
iostreams: 0.818
format: 1.036
karma: 0.087
karma (string): 0.152
karma (string) with policy: 0.396
karma (rule): 0.12
karma (direct): 0.083
karma (direct) string: 0.089
karma (direct) string with policy: 0.278

Built with x64 VC14

Solution

It's not a regression if you compare apples and pears. In this case, twice.

First apple/pear pair

Here fixed is apples, and scientific is pears.

Not only is the resulting output clearly different, but arriving at the result also requires different steps.

Importantly, scientific involves taking the log10 of the input value to establish the magnitude of the number, that is, the number of base-10 digits before the decimal point.

By default, real_policies picks the format with a "cheap" check in floatfield:

    static int floatfield(T n)
    {
        if (traits::test_zero(n))
            return fmtflags::fixed;

        T abs_n = traits::get_absolute_value(n);
        return (abs_n >= 1e5 || abs_n < 1e-3) 
          ? fmtflags::scientific : fmtflags::fixed;
    }

So you can watch the difference evaporate if you choose an input value that would switch to scientific anyway, for example 123456.123456 instead of 12345.12345:

clock resolution: mean is 16.9199 ns (40960002 iterations)

benchmarking format_performance_direct_string
collecting 100 samples, 1 iterations each, in estimated 4.7784 ms
mean: 238.81 ns, lb 187.22 ns, ub 493.46 ns, ci 0.95
std dev: 507.559 ns, lb 5.36317 ns, ub 1111.94 ns, ci 0.95
found 11 outliers among 100 samples (11%)
variance is severely inflated by outliers

benchmarking format_performance_direct_string_with_policy
collecting 100 samples, 96 iterations each, in estimated 1699.2 μs
mean: 173.927 ns, lb 172.764 ns, ub 176.939 ns, ci 0.95
std dev: 8.33706 ns, lb 0.256875 ns, ub 16.9312 ns, ci 0.95
found 2 outliers among 100 samples (2%)
variance is moderately inflated by outliers

benchmarking format_performance_string
collecting 100 samples, 84 iterations each, in estimated 1705.2 μs
mean: 312.646 ns, lb 311.027 ns, ub 314.819 ns, ci 0.95
std dev: 9.42479 ns, lb 7.32668 ns, ub 15.2546 ns, ci 0.95
found 1 outliers among 100 samples (1%)
variance is moderately inflated by outliers

benchmarking format_performance_string_with_policy
collecting 100 samples, 31 iterations each, in estimated 1736 μs
mean: 193.572 ns, lb 192.257 ns, ub 200.032 ns, ci 0.95
std dev: 12.8586 ns, lb 0.322008 ns, ub 30.6708 ns, ci 0.95
found 4 outliers among 100 samples (4%)
variance is severely inflated by outliers

As you can see, the custom policy is (predictably) faster on this input.

Or as a graph: Interactive Link

Second apples/pears pair

This arises where you pinned the precision to 15 digits.

A separate head-to-head benchmark of two policies, one of which also sets the precision (http://paste.ubuntu.com/13087371/), shows that the extra precision more than cancels the benefit of fixing the format to scientific seen above:

clock resolution: mean is 18.6041 ns (40960002 iterations)

benchmarking format_performance_direct_string_with_policy
collecting 100 samples, 1 iterations each, in estimated 1892.9 μs
mean: 228.83 ns, lb 179.9 ns, ub 471.84 ns, ci 0.95
std dev: 483.67 ns, lb 2.29965 ns, ub 1153.98 ns, ci 0.95
found 14 outliers among 100 samples (14%)
variance is severely inflated by outliers

benchmarking format_performance_direct_string_with_policy15
collecting 100 samples, 45 iterations each, in estimated 1858.5 μs
mean: 418.697 ns, lb 410.976 ns, ub 438.865 ns, ci 0.95
std dev: 58.0984 ns, lb 24.1313 ns, ub 115.549 ns, ci 0.95
found 6 outliers among 100 samples (6%)
variance is severely inflated by outliers

benchmarking format_performance_string_with_policy
collecting 100 samples, 87 iterations each, in estimated 1870.5 μs
mean: 262.057 ns, lb 254.73 ns, ub 269.354 ns, ci 0.95
std dev: 37.2502 ns, lb 31.1261 ns, ub 50.5813 ns, ci 0.95
found 17 outliers among 100 samples (17%)
variance is severely inflated by outliers

benchmarking format_performance_string_with_policy15
collecting 100 samples, 42 iterations each, in estimated 1898.4 μs
mean: 458.505 ns, lb 453.626 ns, ub 481.044 ns, ci 0.95
std dev: 45.5401 ns, lb 4.30147 ns, ub 108.045 ns, ci 0.95
found 4 outliers among 100 samples (4%)
variance is severely inflated by outliers

Or as a graph: Interactive Link