Search code examples
c++c++11language-lawyerunsigned-integersigned-integer

Is comparing an underflowed, unsigned integer to -1 well-defined?


Consider the following:

size_t r = 0;
r--;
const bool result = (r == -1);

Does the comparison whose result initialises result have well-defined behaviour?
And is its result true, as I'd expect?


This Q&A was written because I was unsure of two factors in particular.
They may both be identified by use of the term "crucial[ly]" in my answer.

This example is inspired by an approach for loop conditions when the counter is unsigned:
for (size_t r = m.size() - 1; r != -1; r--)


Solution

  • size_t r = 0;
    r--;
    const bool result = (r == -1);
    

    Strictly speaking, the value of result is implementation-defined. In practice, it's almost certain to be true; I'd be surprised if there were an implementation where it's false.

    The value of r after r-- is the value of SIZE_MAX, a macro defined in <stddef.h> / <cstddef>.

    For the comparison r == -1, the usual arithmetic conversions are performed on both operands. The first step in the usual arithmetic conversions is to apply the integral promotions to both operands.

    r is of type size_t, an implementation-defined unsigned integer type. -1 is an expression of type int.

    On most systems, size_t is at least as wide as int. On such systems, the integral promotions cause the value of r either to be converted to unsigned int or to keep its existing type (the former can happen if size_t has the same width as int, but a lower conversion rank). Now the left operand (which is unsigned) has at least the rank of the right operand (which is signed). The right operand is converted to the type of the left operand. This conversion yields the same value as r, and so the equality comparison yields true.

    That's the "normal" case.

    Suppose we have an implementation where size_t is 16 bits (let's say it's a typedef for unsigned short) and int is 32 bits. So SIZE_MAX == 65535 and INT_MAX == 2147483647. Or we could have a 32-bit size_t and a 64-bit int. I doubt that any such implementation exists, but nothing in the standard forbids it (see below).

    Now the left side of the comparison has type size_t and value 65535. Since signed int can represent all the values of type size_t, the integral promotions convert the value to 65535 of type int. Both side of the == operator have type int, so the usual arithmetic conversions have nothing to do. The expression is equivalent to 65535 == -1, which is clearly false.

    As I mentioned, this kind of thing is unlikely to happen with an expression of type size_t -- but it can easily happen with narrower unsigned types. For example, if r is declared as an unsigned short, or an unsigned char, or even a plain char on a system where that type is signed, the value of result will probably be false. (I say probably because short or even unsigned char can have the same width as int, in which case result will be true.)

    In practice, you can avoid the potential problem by doing the conversion explicitly rather than relying on the implementation-defined usual arithmetic conversions:

    const bool result = (r == (size_t)-1);
    

    or

    const bool result = (r == SIZE_MAX);
    

    C++11 standard references:

    • 5.10 [expr.eq] Equality operators
    • 5.9 [expr.rel] Relational operators (specifies that the usual arithmetic conversions are performed)
    • 5 [expr] Expressions, paragraph 9: Usual arithmetic conversions
    • 4.5 [conv.prom] Integral promotions
    • 18.2 [support.types] size_t

    18.2 paragraphs 6-7:

    6 The type size_t is an implementation-defined unsigned integer type that is large enough to contain the size in bytes of any object.

    7 [ Note: It is recommended that implementations choose types for ptrdiff_t and size_t whose integer conversion ranks (4.13) are no greater than that of signed long int unless a larger size is necessary to contain all the possible values. — end note ]

    So there's no prohibition on making size_t narrower than int. I can almost plausibly imagine a system where int is 64 bits, but no single object can be bigger than 232-1 bytes so size_t is 32 bits.