Search code examples
c++cinteger-overflow

How to do a double-chunk add with no undefined behaviour?


EDIT Public health warning - this question includes a false assumption about undefined behaviour. See accepted answer.

After a reading recent blog post, I've been thinking a lot about the practicality of avoiding all standards-undefined assumptions in C and C++ code. Here is a snippet cut out of C++, to do an unsigned 128-bit addition...

void c_UInt64_Pair::operator+= (const c_UInt64_Pair &p)
{
  m_Low  += p.m_Low;
  m_High += p.m_High;

  if (m_Low < p.m_Low)  m_High++;
}

This clearly relies on assumptions about overflow behaviour. Obviously most machines can support a binary integer of the right kind (though perhaps building from 32-bit chunks or whatever), but there's apparently a growing chance that the optimiser may exploit the standards-undefined behaviour here. That is, the only way that the m_Low < p.m_Low condition can pass is if m_Low += p.m_Low overflows, which is undefined behaviour, so the optimiser can legally decide that the condition always fails. In which case, this code is simply broken.

The question is, therefore...

How can you write a reasonably efficient version of the above without relying on undefined behaviour?

Assume that you have an appropriate 64-bit binary machine integer, but that you have a malicious compiler that will always interpret your undefined behaviour in the worst possible (or impossible) way. Also, assume that you don't have some special built-in, intrinsic, library or whatever to do it for you.

EDIT minor clarification - this isn't just about detecting overflow, but also ensuring that both m_Low and m_High end up with the correct modulo 2^64 results, which is also standards-undefined.


Solution

  • From the C++ 1998 standard, 3.9.1(4): "Unsigned integers, declared unsigned, shall obey the laws of arithmetic modulo 2^n where n is the number of bits in the value representation of that particular size of integer." Note that "integer", here, refers to any integer type rather than just int.

    Therefore, assuming that those are unsigned integers, like the "UInt64" in the type suggests, this is defined behavior in C++ and should work as expected.