c scanf language-lawyer undefined-behavior

Is it OK to pass the address of an int for scanf("%x", ...)?


Does the following code have defined behavior?

#include <stdio.h>

int main() {
    int x;
    if (scanf("%x", &x) == 1) {
        printf("decimal: %d\n", x);
    }
    return 0;
}

clang compiles it without any warnings even with all warnings enabled, including -pedantic. The C Standard seems unambiguous about this:

C17 7.21.6.2 The fscanf function

...

... the result of the conversion is placed in the object pointed to by the first argument following the format argument that has not already received a conversion result. If this object does not have an appropriate type, or if the result of the conversion cannot be represented in the object, the behavior is undefined.

...

The conversion specifiers and their meanings are:

...

x Matches an optionally signed hexadecimal integer, whose format is the same as expected for the subject sequence of the strtoul function with the value 16 for the base argument. The corresponding argument shall be a pointer to unsigned integer.

On two's complement architectures, converting -1 with %x seems to work, but it would not on ancient sign/magnitude or ones' complement systems.
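The "optionally signed" wording in the %x description can be checked against strtoul itself, since the Standard defines the accepted input in terms of it. A minimal sketch, where hex_subject_value is a hypothetical helper name used only for illustration:

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: interprets text the way the %x description says the
   input is interpreted, i.e. as strtoul does with 16 as the base argument. */
static unsigned long hex_subject_value(const char *text)
{
    assert(text != NULL);
    /* A leading minus sign is accepted; the converted value is then negated
       in the unsigned return type, so no negative number is ever produced. */
    return strtoul(text, NULL, 16);
}
```

With this, hex_subject_value("-1") yields ULONG_MAX and hex_subject_value("0x1F") yields 31. The value is always an unsigned quantity, which is why the Standard asks for a pointer to unsigned integer.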

Is there any provision to make this behavior defined or at least implementation defined?


Solution

  • This falls in the category of behaviors which quality implementations should support unless they document a good reason for doing otherwise, but which the Standard does not mandate. The authors of the Standard seem to have refrained from trying to list all such behaviors, and there are at least three good reasons for that:

    1. Doing so would have made the Standard longer, and spending ink describing obvious behaviors that readers would expect anyway would distract from the places where the Standard needed to call readers' attention to things that they might not otherwise expect.

    2. The authors of the Standard may not have wanted to preclude the possibility that an implementation might have a good reason for doing something unusual. I don't know whether that was a consideration in your particular case, but it could have been.

      Consider, for example, a (likely theoretical) environment whose calling convention requires passing information about the types of arguments given to variadic functions, and which supplies a scanf function that validates those argument types and squawks if an int* is passed for a %X conversion. The authors of the Standard were almost certainly not aware of any such environment [I doubt any ever existed], and thus would be in no position to weigh the benefits of using the environment's scanf routine against the benefits of supporting the common behavior. Thus, it would make sense to leave such judgment up to people who would be in a better position to assess the costs and benefits of each approach.

    3. It would be extremely difficult for the authors of the Standard to ensure that they exhaustively enumerated all such cases without missing any, and the more exhaustively they were to attempt to enumerate such cases, the more likely it would be that accidental omissions would be misconstrued as deliberate.

    In practice, some compiler writers seem to regard most situations where the Standard fails to mandate the behavior of some action as an invitation to assume code will never attempt it, even if all implementations prior to the Standard had behaved consistently and it's unlikely there would ever be any good reason for an implementation to do otherwise. Consequently, using %X to read an int falls in the category of behaviors that will be reliable on implementations that make any effort to be compatible with common idioms, but could fail on implementations whose designers place a higher value on being able to process useless programs more efficiently, or on implementations that are designed to squawk when given programs that could be undermined by such implementations.
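For code that should not depend on that, one option is to give %x the unsigned object it asks for and convert to int afterwards. The following is a minimal sketch, not a definitive fix: parse_hex_int is a hypothetical helper name, and it uses sscanf on a string instead of scanf on stdin only so that it is easy to try. When the value does not fit in an int, the conversion from unsigned to int is implementation-defined under C17 6.3.1.3 rather than undefined.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: reads a hexadecimal value from text into *out.
   Returns 1 on success, 0 if no hexadecimal number could be read. */
static int parse_hex_int(const char *text, int *out)
{
    unsigned int u;

    assert(text != NULL && out != NULL);

    /* %x receives a pointer to unsigned int, as 7.21.6.2 requires. */
    if (sscanf(text, "%x", &u) != 1)
        return 0;

    /* Values up to INT_MAX convert exactly. Larger values give an
       implementation-defined result (or signal), per C17 6.3.1.3;
       this is an assumption the caller has to accept. */
    *out = (int)u;
    return 1;
}
```

On the common implementations with a 32-bit two's complement int, parse_hex_int("ffffffff", &v) stores -1 in v, which matches what the original program appears to do there, and parse_hex_int("zz", &v) returns 0 without touching v.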