I was checking Boost Spirit Karma generator performance and was somewhat surprised by the performance degradation when using a policy for real numbers.
Live on Coliru
The code was taken from Boost Spirit, with a couple of test functions added. The Coliru example replaces the timer used. Note that Coliru aborts long-running programs, so it may not finish all the tests.
As one can see, using a policy degrades performance by a factor of 2-3 (around 10 on Coliru). Is this expected behavior?
My figures:
sprintf: 0.367
iostreams: 0.818
format: 1.036
karma: 0.087
karma (string): 0.152
karma (string) with policy: 0.396
karma (rule): 0.12
karma (direct): 0.083
karma (direct) string: 0.089
karma (direct) string with policy: 0.278
Built with x64 VC14
It's not a regression if you compare apples and pears. In this case, twice.
Here fixed is apples, and scientific is pears.
Not only is the resulting output clearly different, but arriving at the result also requires different steps.
Importantly, scientific involves taking the log10 of the input value to establish its magnitude, that is, the number of base-10 digits before the decimal point.
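To make that extra step concrete, here is a minimal standalone sketch of deriving a decimal exponent with log10. The helper name decimal_exponent is hypothetical and this is not Karma's actual implementation; it only illustrates the kind of work the scientific path has to do and the fixed path can skip.

```cpp
#include <cmath>

// Hypothetical helper, not part of Boost: decimal exponent e of a non-zero,
// finite value, such that |n| = m * 10^e with 1 <= m < 10.
inline int decimal_exponent(double n)
{
    // log10 gives the magnitude; floor turns it into the exponent digit count.
    return static_cast<int>(std::floor(std::log10(std::fabs(n))));
}
```

With this sketch, decimal_exponent(12345.12345) yields 4 and decimal_exponent(123456.123456) yields 5. Zero and values sitting exactly on a power of ten would need extra care that is left out here.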
By default, real_policies makes a "cheap" decision about which format to use:
static int floatfield(T n)
{
    if (traits::test_zero(n))
        return fmtflags::fixed;

    T abs_n = traits::get_absolute_value(n);
    return (abs_n >= 1e5 || abs_n < 1e-3)
        ? fmtflags::scientific : fmtflags::fixed;
}
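The thresholds are easy to check in isolation. The following is a standalone mirror of the decision quoted above, written without Boost so it can be compiled on its own; the names float_format and default_floatfield are made up for this illustration.

```cpp
#include <cmath>

// Illustrative stand-in for Karma's fmtflags (hypothetical name).
enum class float_format { fixed, scientific };

// Standalone mirror of the default floatfield() decision:
// zero is always fixed; very large or very small magnitudes go scientific.
inline float_format default_floatfield(double n)
{
    if (n == 0.0)
        return float_format::fixed;

    double abs_n = std::fabs(n);
    return (abs_n >= 1e5 || abs_n < 1e-3)
        ? float_format::scientific : float_format::fixed;
}
```

Under these thresholds 12345.12345 is formatted as fixed, while 123456.123456 already crosses 1e5 and is formatted as scientific.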
So you can watch the difference evaporate if you choose a value that would switch to scientific anyway, 123456.123456 instead of 12345.12345:
clock resolution: mean is 16.9199 ns (40960002 iterations)
benchmarking format_performance_direct_string
collecting 100 samples, 1 iterations each, in estimated 4.7784 ms
mean: 238.81 ns, lb 187.22 ns, ub 493.46 ns, ci 0.95
std dev: 507.559 ns, lb 5.36317 ns, ub 1111.94 ns, ci 0.95
found 11 outliers among 100 samples (11%)
variance is severely inflated by outliers
benchmarking format_performance_direct_string_with_policy
collecting 100 samples, 96 iterations each, in estimated 1699.2 μs
mean: 173.927 ns, lb 172.764 ns, ub 176.939 ns, ci 0.95
std dev: 8.33706 ns, lb 0.256875 ns, ub 16.9312 ns, ci 0.95
found 2 outliers among 100 samples (2%)
variance is moderately inflated by outliers
benchmarking format_performance_string
collecting 100 samples, 84 iterations each, in estimated 1705.2 μs
mean: 312.646 ns, lb 311.027 ns, ub 314.819 ns, ci 0.95
std dev: 9.42479 ns, lb 7.32668 ns, ub 15.2546 ns, ci 0.95
found 1 outliers among 100 samples (1%)
variance is moderately inflated by outliers
benchmarking format_performance_string_with_policy
collecting 100 samples, 31 iterations each, in estimated 1736 μs
mean: 193.572 ns, lb 192.257 ns, ub 200.032 ns, ci 0.95
std dev: 12.8586 ns, lb 0.322008 ns, ub 30.6708 ns, ci 0.95
found 4 outliers among 100 samples (4%)
variance is severely inflated by outliers
As you can see, the custom policy is (predictably) much faster.
As a graph: Interactive Link
The second mismatch arises where your policy pinned the precision to 15 digits.
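As a rough illustration of why precision matters, here is a sketch that uses std::snprintf rather than Karma. The helper format_scientific is hypothetical; it only shows how many more digits have to be generated at precision 15 than at 3, which is the default that real_policies uses.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: format n in scientific notation with the given
// number of fractional digits, using the C library instead of Karma.
inline std::string format_scientific(double n, int precision)
{
    char buf[64];
    int len = std::snprintf(buf, sizeof buf, "%.*e", precision, n);
    // Return an empty string on error or truncation.
    if (len <= 0 || len >= static_cast<int>(sizeof buf))
        return std::string();
    return std::string(buf, static_cast<std::string::size_type>(len));
}
```

Comparing format_scientific(12345.12345, 3) with format_scientific(12345.12345, 15) shows the second result carrying 12 extra fractional digits. Each of those digits has to be computed and emitted, so the two settings are not doing the same amount of work.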
A separate head-to-head benchmark of two policies, one of which also sets the precision (http://paste.ubuntu.com/13087371/), shows that the extra precision more than cancels the benefit of fixing the format to scientific seen above:
clock resolution: mean is 18.6041 ns (40960002 iterations)
benchmarking format_performance_direct_string_with_policy
collecting 100 samples, 1 iterations each, in estimated 1892.9 μs
mean: 228.83 ns, lb 179.9 ns, ub 471.84 ns, ci 0.95
std dev: 483.67 ns, lb 2.29965 ns, ub 1153.98 ns, ci 0.95
found 14 outliers among 100 samples (14%)
variance is severely inflated by outliers
benchmarking format_performance_direct_string_with_policy15
collecting 100 samples, 45 iterations each, in estimated 1858.5 μs
mean: 418.697 ns, lb 410.976 ns, ub 438.865 ns, ci 0.95
std dev: 58.0984 ns, lb 24.1313 ns, ub 115.549 ns, ci 0.95
found 6 outliers among 100 samples (6%)
variance is severely inflated by outliers
benchmarking format_performance_string_with_policy
collecting 100 samples, 87 iterations each, in estimated 1870.5 μs
mean: 262.057 ns, lb 254.73 ns, ub 269.354 ns, ci 0.95
std dev: 37.2502 ns, lb 31.1261 ns, ub 50.5813 ns, ci 0.95
found 17 outliers among 100 samples (17%)
variance is severely inflated by outliers
benchmarking format_performance_string_with_policy15
collecting 100 samples, 42 iterations each, in estimated 1898.4 μs
mean: 458.505 ns, lb 453.626 ns, ub 481.044 ns, ci 0.95
std dev: 45.5401 ns, lb 4.30147 ns, ub 108.045 ns, ci 0.95
found 4 outliers among 100 samples (4%)
variance is severely inflated by outliers
Or as a graph: Interactive Link