
Is it undefined behaviour to use the incorrect format specifiers when bits were masked away?


If I have some uint32_t and I am interested in looking at each byte, is it undefined behaviour to use the format specifier for a uint8_t rather than a uint32_t? See below for an example of what I mean.

#include <stdio.h>
#include <inttypes.h>

int main(void)
{
    uint32_t myInt = 0xFF8BA712;
    
    printf("1: %" PRIu8 ", 2: %" PRIu8 ", 3: %" PRIu8 ", 4: %" PRIu8 "\n", myInt & 0xFF, (myInt >> 8) & 0xFF, (myInt >> 16) & 0xFF, (myInt >> 24) & 0xFF);
    
    return 0;
}

Compiling with gcc test.c -Wall -Wextra -pedantic -std=c2x produces no warnings or errors, which suggests to me that this should be OK. However, I use code like this in a codebase that works with 32bpp images, where I often need to extract individual bytes from a whole pixel. I would therefore like to avoid any undefined behaviour when printing these bytes, if it exists.


Solution

  • The clang-cl compiler (in Visual Studio 2019) gives four warnings like the following for your code, one per argument:

    warning : format specifies type 'unsigned char' but the argument has type 'unsigned int' [-Wformat]

    Strictly speaking, passing an argument that is not of the appropriate type for its corresponding format specifier is undefined behaviour. However, any argument narrower than int that is passed to printf is, in any case, promoted to int, so a conversion with the hh length modifier never actually receives a char-sized argument. From this C11 Draft Standard:

    7.21.6.1 The fprintf function

    7   The length modifiers and their meanings are:
        hh   Specifies that a following d, i, o, u, x, or X conversion specifier applies
             to a signed char or unsigned char argument (the argument will have been
             promoted according to the integer promotions, but its value shall be converted
             to signed char or unsigned char before printing); …

    You can remove the warnings by casting each argument to uint8_t, although the excerpt above suggests (to me, at least) that this would make no real difference:

    #include <stdio.h>
    #include <inttypes.h>
    
    int main(void)
    {
        uint32_t myInt = 0xFF8BA712;
        printf("1: %" PRIu8 ", 2: %" PRIu8 ", 3: %" PRIu8 ", 4: %" PRIu8 "\n",
            (uint8_t)(myInt & 0xFF), (uint8_t)((myInt >> 8) & 0xFF), (uint8_t)((myInt >> 16) & 0xFF),
            (uint8_t)((myInt >> 24) & 0xFF));
    
        return 0;
    }
    

    Note that, on my system, PRIu8 is defined as "hhu" and uint8_t is equivalent to unsigned char – and this is likely to be the case on many other platforms.