I've used std::chrono for years and have watched many of Howard Hinnant's talks on the design and use of the library. I like it, and I think I generally understand it. However, recently, I suddenly realized that I didn't know how to use it practically and safely to avoid undefined behaviour.
Please bear with me while I walk through a few cases to set the stage for my question.
Let's start with what I think is the "simplest" std::chrono::duration type, nanoseconds. Its minimum rep size is 64-bit, which means in practice it will be std::int64_t and there are thus probably no "leftover" optional representational bits that aren't required to be there by the standard.
This function is obviously not always safe:
nanoseconds f1(nanoseconds value)
{ return ++value; }
If value is nanoseconds::max(), then this overflows, which we can confirm with clang 7's UBSan (-fsanitize=undefined):
runtime error: signed integer overflow: 9223372036854775807 + 1 cannot be
represented in type 'std::__1::chrono::duration<long long,
std::__1::ratio<1, 1000000000> >::rep' (aka 'long long')
But this is nothing special. It's no different than the typical integer case:
std::int64_t f2(std::int64_t value)
{ return ++value; }
When we can't be sure that value isn't already its maximum, we check first, and handle the error however we deem appropriate. For example:
nanoseconds f3(nanoseconds value)
{
    if(value == value.max())
    {
        throw std::overflow_error{"f3"};
    }
    return ++value;
}
If we have an existing (unknown) nanoseconds value that we want to add another (unknown) nanoseconds value to, the naive approach is:
struct Foo
{
    // Pretend this can be set in other meaningful ways so we
    // don't know what it is.
    nanoseconds m_nanos = nanoseconds::max();
    nanoseconds f4(nanoseconds value)
    { return m_nanos + value; }
};
And again, we'll get into trouble:
runtime error: signed integer overflow: 9223372036854775807 +
9223372036854775807 cannot be represented in type 'long long'
Foo{}.f4(nanoseconds::max()) = -2 ns
So, again, we can do as we would with the integers, but it's already getting trickier because these are signed integers:
struct Foo
{
    explicit Foo(nanoseconds nanos = nanoseconds::max())
        : m_nanos{nanos}
    {}
    // Again, pretend this can be set in other ways, so we don't
    // know what it is.
    nanoseconds m_nanos;
    nanoseconds f5(nanoseconds value)
    {
        if(m_nanos > m_nanos.zero() && value > m_nanos.max() - m_nanos)
        {
            throw std::overflow_error{"f5+"};
        }
        else if(m_nanos < m_nanos.zero() && value < m_nanos.min() - m_nanos)
        {
            throw std::overflow_error{"f5-"};
        }
        return m_nanos + value;
    }
};
Foo{}.f5(0ns) = 9223372036854775807 ns
Foo{}.f5(nanoseconds::min()) = -1 ns
Foo{}.f5(1ns) threw std::overflow_error: f5+
Foo{}.f5(nanoseconds::max()) threw std::overflow_error: f5+
Foo{nanoseconds::min()}.f5(0ns) = -9223372036854775808 ns
Foo{nanoseconds::min()}.f5(nanoseconds::max()) = -1 ns
Foo{nanoseconds::min()}.f5(-1ns) threw std::overflow_error: f5-
Foo{nanoseconds::min()}.f5(nanoseconds::min()) threw std::overflow_error: f5-
I think I got that right. It's starting to get harder to be certain if the code is correct.
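One way to gain a little more confidence is to pull the f5() check out into a free function that only accepts two durations of the same type, so no unit conversion can sneak into the comparisons. This is just a sketch under the assumption that Rep is a signed integer type; checked_add is a name I made up, not something std::chrono provides.

```cpp
#include <chrono>
#include <stdexcept>

// Hypothetical helper (not part of std::chrono): adds two durations of the
// same type and throws instead of overflowing. Assumes a signed integer Rep.
template <class Rep, class Period>
std::chrono::duration<Rep, Period>
checked_add(std::chrono::duration<Rep, Period> lhs,
            std::chrono::duration<Rep, Period> rhs)
{
    using D = std::chrono::duration<Rep, Period>;
    // Both operands already have type D, so the comparisons and the
    // subtractions below involve no conversion between units.
    if (lhs > D::zero() && rhs > D::max() - lhs)
        throw std::overflow_error{"checked_add+"};
    if (lhs < D::zero() && rhs < D::min() - lhs)
        throw std::overflow_error{"checked_add-"};
    return lhs + rhs;
}
```

Because both parameters must deduce the same Rep and Period, passing an hours and a nanoseconds fails to compile rather than silently converting.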
So far, things might seem manageable, but what about this case?
nanoseconds f6(hours value)
{ return m_nanos + value; }
We have the same problem as we did with f4(). Can we solve it the same way as f5() did? Let's use the same body as f5(), but just change the argument type, and see what happens:
nanoseconds f7(hours value)
{
    if(m_nanos > m_nanos.zero() && value > m_nanos.max() - m_nanos)
    {
        throw std::overflow_error{"f7+"};
    }
    else if(m_nanos < m_nanos.zero() && value < m_nanos.min() - m_nanos)
    {
        throw std::overflow_error{"f7-"};
    }
    return m_nanos + value;
}
This seems sane because we're still checking if there's room between nanoseconds::max() and m_nanos to add in value. So what happens when we run this?
Foo{}.f7(0h) = 9223372036854775807 ns
/usr/lib/llvm-7/bin/../include/c++/v1/chrono:880:59: runtime error: signed
integer overflow: -9223372036854775808 * 3600000000000 cannot be represented
in type 'long long'
Foo{}.f7(hours::min()) = 9223372036854775807 ns
Foo{}.f7(1h) threw std::overflow_error: f7+
Foo{}.f7(hours::max()) DIDN'T THROW!!!!!!!!!!!!!!
Foo{nanoseconds::min()}.f7(0h) = -9223372036854775808 ns
terminating with uncaught exception of type std::overflow_error: f7-
Aborted
Oh my. That definitely didn't work.
In my test driver, the UBSan error is printed above the call that it is reporting on, so the first failure is Foo{}.f7(hours::min()). But that case shouldn't even throw, so why did it fail?
The answer is that even the act of comparing hours to nanoseconds involves a conversion. This is because the comparison operators are implemented via the use of std::common_type, which std::chrono defines for duration types in terms of the greatest common divisor of the period values. In our case, that's nanoseconds, so first, the hours is converted to nanoseconds. A snippet from libc++ shows part of this:
template <class _LhsDuration, class _RhsDuration>
struct __duration_lt
{
    _LIBCPP_INLINE_VISIBILITY _LIBCPP_CONSTEXPR
    bool operator()(const _LhsDuration& __lhs, const _RhsDuration& __rhs) const
    {
        typedef typename common_type<_LhsDuration, _RhsDuration>::type _Ct;
        return _Ct(__lhs).count() < _Ct(__rhs).count();
    }
};
Since we didn't check that our hours value was small enough to fit in nanoseconds (on this particular standard library implementation, with its particular rep type choices), the following are essentially equivalent:
if(m_nanos > m_nanos.zero() && value > m_nanos.max() - m_nanos)
if(m_nanos > m_nanos.zero() && nanoseconds{value} > m_nanos.max() - m_nanos)
As an aside, the same problem will exist if hours uses a 32-bit rep:
runtime error: signed integer overflow: 2147483647 * 3600000000000 cannot be
represented in type 'long long'
Of course, if we make the value small enough, including by limiting the rep size, we can end up making it fit . . . since obviously some hours values can be represented as nanoseconds or conversions would be pointless.
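To make the hidden conversion concrete, here is a small sketch that asks the library which type the comparison actually happens in, and how many whole hours that type can hold. The names Ct, per_hour and largest_safe_hours are made up for illustration, and the final number depends on the implementation's rep choices (with a 64-bit nanoseconds::rep it works out to 2562047).

```cpp
#include <chrono>
#include <ratio>
#include <type_traits>

using std::chrono::hours;
using std::chrono::nanoseconds;

// The type in which an `hours < nanoseconds` comparison is evaluated.
using Ct = std::common_type_t<hours, nanoseconds>;
static_assert(std::ratio_equal<Ct::period, std::nano>::value,
              "hours and nanoseconds are compared in nanosecond units");

// Nanoseconds per hour, derived from the two periods rather than hard-coded.
using per_hour = std::ratio_divide<hours::period, nanoseconds::period>;
static_assert(per_hour::den == 1, "hours -> nanoseconds needs no division");

// Largest whole number of hours whose nanosecond count still fits in Ct::rep.
// Any hours value beyond this overflows as soon as it is compared.
constexpr Ct::rep largest_safe_hours = Ct::max().count() / per_hour::num;
```

Anything larger than largest_safe_hours overflows inside operator< before our own check even gets a chance to run.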
Let's not give up yet. Conversions are another important case anyway, so we should know how to handle them safely. Surely that can't be too hard.
The first hurdle is that we need to know if we can even get from hours to nanoseconds without overflowing the nanoseconds::rep type. Again, do as we would with integers and do a multiplication overflow check. For the moment, let's ignore negative values. We could do this:
nanoseconds f8(hours value)
{
    assert(value >= value.zero());
    if(value.count()
       > std::numeric_limits<nanoseconds::rep>::max() / 3600000000000)
    {
        throw std::overflow_error{"f8+"};
    }
    return value;
}
It seems to work if we test it against the limits of our standard library's choice of nanoseconds::rep:
f8(0h) = 0 ns
f8(1h) = 3600000000000 ns
f8(2562047h) = 9223369200000000000 ns
f8(2562048h) threw std::overflow_error: f8+
f8(hours::max()) threw std::overflow_error: f8+
But, there are some pretty serious limitations. First, we had to "know" how to convert between hours and nanoseconds, which kind of defeats the point. Secondly, this only handles these very particular two types with a very nice relationship between their period types (where only a single multiplication is needed).
Imagine we wanted to implement an overflow-safe conversion of just the standard named duration types, supporting only the lossless conversions:
template <typename target_duration, typename source_duration>
target_duration lossless(source_duration duration)
{
    // ... ?
}
It seems like we need to compute relationships between ratios and make decisions
and check the multiplications based on that . . . and once we've done that,
we've had to understand and re-implement all of the logic in the duration
operators (but now with overflow safety) that we originally set out to use in
the first place! We can't really need to implement the type just to use the
type, can we?
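For what it's worth, a rough sketch of such a function for the simplest case might look like the following. It assumes both reps are signed integer types no wider than std::intmax_t, and that std::ratio_divide of the two periods doesn't itself overflow at compile time. The name checked_lossless is hypothetical, not anything std::chrono provides.

```cpp
#include <chrono>
#include <cstdint>
#include <ratio>
#include <stdexcept>
#include <type_traits>

// Hypothetical helper for this sketch: a lossless conversion that throws
// instead of overflowing. Signed integer reps only.
template <class To, class Rep, class Period>
To checked_lossless(std::chrono::duration<Rep, Period> d)
{
    using to_rep = typename To::rep;
    // Factor by which source ticks are multiplied to obtain target ticks.
    using factor = std::ratio_divide<Period, typename To::period>;
    static_assert(factor::den == 1, "conversion would be lossy");
    static_assert(std::is_integral<Rep>::value && std::is_signed<Rep>::value,
                  "sketch handles signed integer source reps only");
    static_assert(std::is_integral<to_rep>::value && std::is_signed<to_rep>::value,
                  "sketch handles signed integer target reps only");

    const std::intmax_t n = d.count();
    // Integer division truncates toward zero, so every n in [lo, hi]
    // satisfies To::min() <= n * factor::num <= To::max().
    const std::intmax_t hi = To::max().count() / factor::num;
    const std::intmax_t lo = To::min().count() / factor::num;
    if (n > hi) throw std::overflow_error{"checked_lossless+"};
    if (n < lo) throw std::overflow_error{"checked_lossless-"};
    return To{static_cast<to_rep>(n * factor::num)};
}
```

Even this only covers pure multiplications, and it already duplicates the ratio arithmetic that duration performs internally.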
Plus, when we're done, we just have some function, lossless(), that performs a conversion if we explicitly call it instead of allowing natural implicit conversions, or some other function that adds a value if we explicitly call it instead of using operator+(), so we've lost the expressiveness that is a huge part of the value of duration.
Add into the mix lossy conversions with duration_cast and it seems hopeless.
I'm not even sure how I would approach dealing with something as simple as this:
template <typename duration1, typename duration2>
bool isSafe(duration1 limit, duration2 reading)
{
    assert(limit >= limit.zero());
    return reading < limit / 2;
}
Or, worse, this, even if I knew some things about grace:
template <typename duration1, typename duration2>
bool isSafe2(duration1 limit, duration2 reading, milliseconds grace)
{
    assert(limit >= limit.zero());
    assert(grace >= grace.zero());
    const auto test = limit / 2;
    return grace < test && reading < (test - grace);
}
If duration1 and duration2 can really be any duration type (including things like std::chrono::duration<std::int16_t, std::ratio<3, 7>>), I can't see a way to proceed with confidence. But even if we limit ourselves to "normal" duration types, there are a lot of scary outcomes.
In some ways this situation is no "worse" than dealing with normal fixed-size integers, like everyone does every day, where you often "ignore" the possibility of overflow because you "know" the domain of values you're working with. But, surprisingly to me, these types of solutions seem "worse" with std::chrono than they do with normal integers because as soon as you try to be safe with respect to overflow, you end up defeating the benefits of using std::chrono in the first place.
If I make my own duration types based on an unsigned rep, I guess I technically avoid at least some of the undefined behaviour from an integer overflow point of view, but I still can easily get garbage results from "careless" calculations. The "problem space" seems to be the same.
I'm not interested in a solution based on floating point types. I'm using std::chrono to keep the precise resolutions I choose in each case. If I didn't care about precision or rounding errors, I could easily just use double to count seconds everywhere and not mix units. But if that were a viable solution for every problem, we wouldn't have std::chrono (or even struct timespec, for that matter).
So my question is, how do I safely and practically use std::chrono to do something as simple as add together two values of different duration types without fear of undefined behaviour because of integer overflow? Or do lossless conversions safely? I haven't figured out a practical solution even with known simple duration types, let alone the rich universe of all possible duration types. What am I missing?
The highest-performing answer is to know your domain, and not program anywhere near the maximum range of the precision you're using. If you're using nanoseconds, the range is +/- 292 years. Don't go anywhere near that far. If you need more range than just +/- 100 years, use a coarser resolution than nanoseconds.
If you can follow those rules, then you can just not worry about overflow.
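As a quick sanity check on that figure, the range of a duration type can be expressed in years by converting its max() to a floating-point duration. This is only a sketch: years_d is a name made up for this example, defined by hand with the average Gregorian year (365.2425 days) so that it also builds before C++20.

```cpp
#include <chrono>
#include <ratio>

// Average Gregorian year in seconds: 365.2425 * 86400 = 31556952.
// `years_d` is a hypothetical alias used only for this illustration.
using years_d = std::chrono::duration<double, std::ratio<31556952>>;

// The implicit conversion is allowed because the target rep is floating point.
// With a 64-bit rep this is a little over 292 years.
constexpr years_d ns_range = std::chrono::nanoseconds::max();

// A coarser resolution buys range: with the same rep width, microseconds
// reach 1000 times further than nanoseconds.
constexpr years_d us_range = std::chrono::microseconds::max();
```

If your domain's worst case is nowhere near ns_range.count() years, plain nanoseconds arithmetic is fine; otherwise pick the coarser unit.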
Sometimes you can't. For example, if your code must handle untrusted input, or general input (e.g. a general-purpose library), then you really do need to check for overflow.
One technique is to choose a rep, used only for the comparison, that can handle more range than anyone needs. int128_t and double are two tools I reach for in this case. For example, here is a checked_convert that checks for overflow using double prior to actually performing the duration_cast:
#include <chrono>
#include <stdexcept>

template <class Duration, class Rep, class Period>
Duration
checked_convert(std::chrono::duration<Rep, Period> d)
{
    using namespace std::chrono;
    // Same period as the target, but with a double rep used only for the check.
    using S = duration<double, typename Duration::period>;
    constexpr S m = Duration::min();
    constexpr S M = Duration::max();
    S s = d;
    // Reject values outside the target's range before the real conversion.
    if (s < m || s > M)
        throw std::overflow_error("checked_convert");
    return duration_cast<Duration>(d);
}
It is significantly more expensive. But if you are writing (for example) std::this_thread::sleep_for, it is worth the expense.
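Here is a small usage sketch. checked_convert is repeated from above so the example compiles on its own; try_convert is a hypothetical wrapper added only for illustration, reporting failure through its return value instead of an exception.

```cpp
#include <chrono>
#include <stdexcept>

// checked_convert as given above, repeated so this compiles alone.
template <class Duration, class Rep, class Period>
Duration checked_convert(std::chrono::duration<Rep, Period> d)
{
    using namespace std::chrono;
    using S = duration<double, typename Duration::period>;
    constexpr S m = Duration::min();
    constexpr S M = Duration::max();
    S s = d;
    if (s < m || s > M)
        throw std::overflow_error("checked_convert");
    return duration_cast<Duration>(d);
}

// Hypothetical wrapper: returns false instead of throwing when the value
// does not fit, and leaves `out` untouched in that case.
template <class Duration, class Rep, class Period>
bool try_convert(std::chrono::duration<Rep, Period> d, Duration& out)
{
    try
    {
        out = checked_convert<Duration>(d);
        return true;
    }
    catch (const std::overflow_error&)
    {
        return false;
    }
}
```

One caveat worth keeping in mind: double has a 53-bit mantissa, so the min() and max() of a 64-bit duration are rounded when converted for the check, and values within rounding distance of the limits may not be classified exactly.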
If for some reason you can't use floating point even for checks, I have experimented with lcm_type (not a great name). This is the opposite of common_type_t<Duration1, Duration2>. Instead of finding the duration that both input durations can convert to without loss (without a division), it finds the duration that both input durations can convert to without a multiplication. For example, lcm_type_t<milliseconds, nanoseconds> has type milliseconds. Such a conversion cannot overflow.
template <class Duration0, class ...Durations>
struct lcm_type;

template <class Duration>
struct lcm_type<Duration>
{
    using type = Duration;
};

template <class Duration1, class Duration2>
struct lcm_type<Duration1, Duration2>
{
    template <class D>
    using invert = std::chrono::duration
                   <
                       typename D::rep,
                       std::ratio_divide<std::ratio<1>, typename D::period>
                   >;
    using type = invert<typename std::common_type<invert<Duration1>,
                                                  invert<Duration2>>::type>;
};

template <class Duration0, class Duration1, class Duration2, class ...Durations>
struct lcm_type<Duration0, Duration1, Duration2, Durations...>
{
    using type = typename lcm_type<
                     typename lcm_type<Duration0, Duration1>::type,
                     Duration2, Durations...>::type;
};

template <class ...T>
using lcm_type_t = typename lcm_type<T...>::type;
You can convert both input durations to lcm_type_t<Duration1, Duration2>, without fear of overflow, then do the comparison.
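A sketch of what that comparison might look like follows. lcm_type is repeated from above so the example stands alone, and coarse_less is a hypothetical name for the helper, not an established API.

```cpp
#include <chrono>
#include <ratio>
#include <type_traits>

template <class Duration0, class ...Durations>
struct lcm_type;

template <class Duration>
struct lcm_type<Duration> { using type = Duration; };

template <class Duration1, class Duration2>
struct lcm_type<Duration1, Duration2>
{
    template <class D>
    using invert = std::chrono::duration<
        typename D::rep,
        std::ratio_divide<std::ratio<1>, typename D::period>>;
    using type = invert<typename std::common_type<invert<Duration1>,
                                                  invert<Duration2>>::type>;
};

template <class Duration0, class Duration1, class Duration2, class ...Durations>
struct lcm_type<Duration0, Duration1, Duration2, Durations...>
{
    using type = typename lcm_type<
        typename lcm_type<Duration0, Duration1>::type,
        Duration2, Durations...>::type;
};

template <class ...T>
using lcm_type_t = typename lcm_type<T...>::type;

// Hypothetical helper: compares two durations after converting both to the
// coarser lcm_type_t. The conversions only divide, so they cannot overflow,
// but they truncate, so nearly equal inputs can compare as equal.
template <class D1, class D2>
bool coarse_less(D1 a, D2 b)
{
    using L = lcm_type_t<D1, D2>;
    return std::chrono::duration_cast<L>(a) < std::chrono::duration_cast<L>(b);
}
```

For example, coarse_less(nanoseconds::max(), hours::max()) can be evaluated without overflow, whereas the built-in operator< would first multiply hours::max() up to nanoseconds.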
The problem with this technique is that it is not precise. Two slightly different durations might convert to the lcm_type_t and because of truncation losses, compare equal. For this reason I prefer the solution with double, but it is good to have lcm_type in your toolbox too.