
"Overflow" assertions on various C data types: which ones are guaranteed to be true according to the C Standard?


While all those assertions hold true on my system, I am obviously relying on several undefined and/or implementation-specific behaviors, some of which are apparently not actual overflow.

See this comment for reference: this is the reason why I am asking this question.

num = num + 1 does not cause an overflow. num is automatically promoted to int, and then the addition is performed in int, which yields 128 without overflow. Then the assignment performs a conversion to char.

This is not an overflow but, per C 2018 6.3.1.3, produces an implementation-defined result or signal. This differs from overflow because the C standard does not specify the behavior upon overflow at all, but, in this code, it specifies that the implementation must define the behavior. - Eric Postpischil

I put in comment what I believe to be the actual behavior.

Because I have relied on misconceptions, I prefer not to assume anything.

#include <limits.h>
#include <assert.h>
#include <stdint.h>
#include <stddef.h>

int main(void)
{
    signed char     sc = CHAR_MAX;
    unsigned char   uc = UCHAR_MAX;
    signed short    ss = SHRT_MAX;
    unsigned short  us = USHRT_MAX;
    signed int      si = INT_MAX;
    unsigned int    ui = UINT_MAX;
    signed long     sl = LONG_MAX;
    unsigned long   ul = ULONG_MAX;
    size_t          zu = SIZE_MAX;

    ++sc;
    ++uc;
    ++ss;
    ++us;
    ++si;
    ++ui;
    ++sl;
    ++ul;
    ++zu;
    assert(sc == CHAR_MIN); //integer promotion, implementation specific ?
    assert(uc == 0); //integer promotion, implementation specific ?
    assert(ss == SHRT_MIN); //integer promotion, implementation specific ? 
    assert(us == 0); //integer promotion, implementation specific ?
    assert(si == INT_MIN); //overflow & undefined
    assert(ui == 0); //wrap around: Guaranteed
    assert(sl == LONG_MIN); //overflow & undefined ?
    assert(ul == 0); //wrap around: Guaranteed ?
    assert(zu == 0); //wrap around : Guaranteed ?
    return (0);
}

Solution

  • All citations below are from C 2018, official version.

    Signed Integers Narrower Than int, Binary +

    Let us discuss this case first since it is the one that prompted this question. Consider this code, which does not appear in the question:

    signed char     sc = SCHAR_MAX;
    sc = sc + 1;
    assert(sc == SCHAR_MIN);
    

    6.5.6 discusses the binary + operator. Paragraph 4 says the usual arithmetic conversions are performed on the operands. This results in the sc in sc + 1 being converted to int (see footnote 1), and 1 is already int. So sc + 1 yields one more than SCHAR_MAX (commonly 127 + 1 = 128), and there is no overflow or representation problem in the addition.

    Then we must perform the assignment, which is discussed in 6.5.16.1. Paragraph 2 says “… the value of the right operand is converted to the type of the assignment expression and replaces the value stored in the object designated by the left operand.” So we must convert this value greater than SCHAR_MAX to signed char, and it clearly cannot be represented in signed char.

    6.3.1.3 tells us about the conversions of integers. Regarding this situation, it says “… Otherwise, the new type is signed and the value cannot be represented in it; either the result is implementation-defined or an implementation-defined signal is raised.”

    Thus, we have an implementation-defined result or signal. This differs from overflow, which is what happens when, during evaluation of an expression, the result is not representable. 6.5 paragraph 5 says “If an exceptional condition occurs during the evaluation of an expression (that is, if the result is not mathematically defined or not in the range of representable values for its type), the behavior is undefined.” For example, if we evaluate INT_MAX + 1, then both INT_MAX and 1 have type int, so the operation is performed with type int, but the mathematical result is not representable in int, so this is an exceptional condition, and the behavior is not defined by the C standard. In contrast, during the conversion, the behavior is partially defined by the standard: The standard requires the implementation to define the behavior, and it must either produce a result it defines or define a signal.

    In many implementations, the assertion will evaluate to true. See the “Signed Integers Not Narrower Than int” section below for further discussion.
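The promotion step described above can be checked directly. The following is a minimal sketch, assuming a C11 compiler (for _Generic) and an implementation in which signed char is narrower than int; the function names are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <limits.h>

/* Returns 1 if the expression sc + 1 has type int, 0 otherwise.
   _Generic selects a branch by the type of its controlling expression,
   which is not evaluated. */
int sum_has_type_int(signed char sc)
{
    return _Generic(sc + 1, int: 1, default: 0);
}

/* Computes sc + 1 in int. Assuming signed char is narrower than int,
   the sum is representable even when sc == SCHAR_MAX, so no overflow occurs. */
int promoted_sum(signed char sc)
{
    return sc + 1;
}
```

Only the later conversion of that int result back to signed char is implementation-defined; the sketch deliberately stops before that step.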

    Signed Integers Narrower Than int, Prefix ++

    Next, consider this case, extracted from the question, except that I changed CHAR_MAX and CHAR_MIN to SCHAR_MAX and SCHAR_MIN to match the signed char type:

    signed char     sc = SCHAR_MAX;
    ++sc;
    assert(sc == SCHAR_MIN);
    

    We have unary ++ instead of binary +. 6.5.3.1 paragraph 2 says “The value of the operand of the prefix ++ is incremented…” This clause does not explicitly say the usual arithmetic conversions or integer promotions are performed, but it does say, also in paragraph 2, “See the discussions of additive operators and compound assignment for information on constraints, types, side effects, and conversions and the effects of operations on pointers.” That tells us it behaves like sc = sc + 1;, and the above section about binary + applies to prefix ++, so the behavior is the same.
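As a rough illustration of that equivalence, the sketch below compares the two forms on a value that stays in range, so nothing implementation-defined is involved. It assumes a C11 compiler for _Generic, and all function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>

/* Increments with prefix ++. */
signed char by_prefix(signed char sc)
{
    ++sc;
    return sc;
}

/* Increments with binary + followed by assignment. */
signed char by_binary(signed char sc)
{
    sc = sc + 1;
    return sc;
}

/* Returns 1 if the expression ++sc has type signed char, that is, if the
   int sum has been converted back to the operand's type.
   The controlling expression of _Generic is not evaluated. */
int increment_has_type_signed_char(signed char sc)
{
    return _Generic(++sc, signed char: 1, default: 0);
}
```

The type check reflects that the conversion back to signed char is part of ++sc itself, just as it is part of the assignment in sc = sc + 1.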

    Unsigned Integers Narrower Than int, Binary +

    Consider this code modified to use binary + instead of prefix ++:

    unsigned char   uc = UCHAR_MAX;
    uc = uc + 1;
    assert(uc == 0);
    

    As with signed char, the arithmetic is performed with int and then converted to the assignment destination type. This conversion is specified by 6.3.1.3: “Otherwise, if the new type is unsigned, the value is converted by repeatedly adding or subtracting one more than the maximum value that can be represented in the new type until the value is in the range of the new type.” Thus, from the mathematical result (UCHAR_MAX + 1), one more than the maximum (also UCHAR_MAX + 1) is subtracted until the value is in range. A single subtraction yields 0, which is in range, so the result is 0, and the assertion is true.
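The "repeatedly adding or subtracting" rule can be written out by hand and compared with the conversion the implementation performs. This is a sketch only: reduce_by_subtraction and reduce_by_conversion are hypothetical helpers, and the code assumes UCHAR_MAX is smaller than LONG_MAX, which holds on ordinary implementations.

```c
#include <assert.h>
#include <limits.h>

/* Applies the 6.3.1.3 rule manually: add or subtract UCHAR_MAX + 1
   until the value is in the range of unsigned char. */
unsigned char reduce_by_subtraction(long value)
{
    const long modulus = (long)UCHAR_MAX + 1;

    while (value > UCHAR_MAX)
        value -= modulus;
    while (value < 0)
        value += modulus;
    return (unsigned char)value;   /* value is now in range */
}

/* Lets the implementation perform the same conversion. */
unsigned char reduce_by_conversion(long value)
{
    return (unsigned char)value;
}
```

Both helpers should agree for every input, because the standard fully specifies the result of converting to an unsigned type.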

    Unsigned Integers Narrower Than int, Prefix ++

    Consider this code extracted from the question:

    unsigned char   uc = UCHAR_MAX;
    ++uc;
    assert(uc == 0);
    

    As with the earlier prefix ++ case, the arithmetic is the same as uc = uc + 1, discussed above.

    Signed Integers Not Narrower Than int

    In this code:

    signed int      si = INT_MAX;
    ++si;
    assert(si == INT_MIN);
    

    or this code:

    signed int      si = INT_MAX;
    si = si + 1;
    assert(si == INT_MIN);
    

    the arithmetic is performed using int. In either case, the computation overflows, and the behavior is not defined by the C standard.

    If we ponder what implementations will do, several possibilities are:

    • In a two’s complement implementation, the bit pattern resulting from adding 1 to INT_MAX overflows to the bit pattern for INT_MIN, and this is the value the implementation effectively uses.
    • In a one’s complement implementation, the bit pattern resulting from adding 1 to INT_MAX overflows to the bit pattern for INT_MIN, although it is a different value than we are familiar with for INT_MIN (−2³¹+1 instead of −2³¹).
    • In a sign-and-magnitude implementation, the bit pattern resulting from adding 1 to INT_MAX overflows to the bit pattern for −0.
    • The hardware detects overflow, and a signal occurs.
    • The compiler detects the overflow and transforms the code in unexpected ways during optimization.
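Since none of these outcomes can be relied on, code that must not overflow can test the operands before adding. The following is one possible sketch, not something taken from the question or the standard; checked_add_int is a hypothetical helper name.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Stores a + b in *result and returns true when the sum is representable
   in int. Otherwise returns false and leaves *result untouched.
   The comparisons use INT_MAX - b and INT_MIN - b, which cannot overflow
   for the sign of b being tested, so the overflowing addition is never
   evaluated. */
bool checked_add_int(int a, int b, int *result)
{
    if (b > 0 && a > INT_MAX - b)
        return false;   /* sum would exceed INT_MAX */
    if (b < 0 && a < INT_MIN - b)
        return false;   /* sum would fall below INT_MIN */
    *result = a + b;
    return true;
}
```

With this helper, the INT_MAX + 1 case from the question is reported as a failure instead of being evaluated.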

    Unsigned Integers Not Narrower Than int

    These cases are unremarkable; the behavior is the same as for the narrower-than-int cases discussed above: The arithmetic wraps.
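The wrapping for the unsigned int, unsigned long, and size_t cases from the question can be sketched as follows. The function names are hypothetical; the results do not depend on the width of each type.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Each function adds one in an unsigned type; the result is reduced
   modulo one more than the maximum value of that type. */
unsigned int next_uint(unsigned int x)
{
    return x + 1u;
}

unsigned long next_ulong(unsigned long x)
{
    return x + 1ul;
}

size_t next_size(size_t x)
{
    return x + 1u;
}
```

Passing the maximum value of each type yields 0, matching the ui, ul, and zu assertions in the question.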

    Footnote

    1 Per discussion elsewhere in Stack Overflow, it may be theoretically possible for the char (and signed char) type to be as wide as an int. This strains the C standard regarding EOF and possibly other issues and was certainly not anticipated by the C committee. This answer disregards such esoteric C implementations and considers only implementations in which char is narrower than int.