Tags: c, language-lawyer, undefined-behavior, integer-promotion

Is it UB to give a char argument to printf where printf expects an int?


Do I understand the standard correctly that this program causes UB:

#include <stdio.h>

int main(void)
{
    char a = 'A';
    printf("%c\n", a);
    return 0;
}

When it is executed on a system where sizeof(int)==1 && CHAR_MIN==0?

If char is unsigned and has the same size as int (1), then a is promoted to unsigned int [1] (2) rather than to int, because an int cannot represent all values of a char. The format specifier "%c" expects an int [2], and passing an argument of the wrong signedness to printf() causes UB [3].

Relevant quotes from ISO/IEC 9899 for C99

[1] Promotion to int according to C99 6.3.1.1:2:

If an int can represent all values of the original type, the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions. All other types are unchanged by the integer promotions.

[2] The format specifier "%c" expects an int argument, C99 7.19.6.1:8 c:

If no l length modifier is present, the int argument is converted to an unsigned char, and the resulting character is written.

[3] Using the wrong type in fprintf() (3), including wrong signedness, causes UB according to C99 7.19.6.1:9:

... If any argument is not the correct type for the corresponding conversion specification, the behavior is undefined.

The exception for corresponding signed and unsigned types is granted for the va_arg macro but not for printf(), and nothing requires printf() to use va_arg (4).

Footnotes: (marked with (n))

  1. This implies INT_MAX==SCHAR_MAX, because char has no padding.

  2. See also this question: Is unsigned char always promoted to int?

  3. The same rules are applied to printf(), see C99 7.19.6.3:2

  4. See also this question: Does printf("%x",1) invoke undefined behavior?


Solution

  • TL;DR: there is no UB (in my interpretation, at any rate).

    6.2.5 Types
    6. For each of the signed integer types, there is a corresponding (but different) unsigned integer type (designated with the keyword unsigned) that uses the same amount of storage (including sign information) and has the same alignment requirements.
    9. The range of nonnegative values of a signed integer type is a subrange of the corresponding unsigned integer type, and the representation of the same value in each type is the same. 41)
    41) The same representation and alignment requirements are meant to imply interchangeability as arguments to functions, return values from functions, and members of unions.

    Furthermore

    7.16.1.1 The va_arg macro
    2 The va_arg macro expands to an expression that has the specified type and the value of the next argument in the call. [...] If there is no actual next argument, or if type is not compatible with the type of the actual next argument (as promoted according to the default argument promotions), the behavior is undefined, except for the following cases:

    • one type is a signed integer type, the other type is the corresponding unsigned integer type, and the value is representable in both types;

    7.21.6.8 The vfprintf function
    288) [...] functions vfprintf, vfscanf, vprintf, vscanf, vsnprintf, vsprintf, and vsscanf invoke the va_arg macro [...]

    Thus, it stands to reason that an unsigned type is not "an incorrect type for the corresponding (signed) conversion specification", as long as the value is within the range.

    This is corroborated by the fact that major compilers do not, by default, warn about a signed/unsigned format specification mismatch, even though they do warn about other mismatches, including ones where the two types have the same representation on a given platform (e.g. long and long long).