
sscanf() hex ints into an array of ints vs. unsigned chars


I am converting a string representation of a MAC address into an array of UINT8s, where UINT8 is defined as unsigned char. I am curious why sscanf() reads all 0s when I read into an array of UINT8s, but the actual values when I read into an array of regular 32-bit ints. It's almost as if it's chopping the 8 bits off the wrong end of the int.

char *strMac = "11:22:33:AA:BB:CC";

typedef unsigned char UINT8;
UINT8 uMAC[6];

int iMAC[6];

sscanf( (const char*) strMac, 
        "%x:%x:%x:%x:%x:%x", 
        &uMAC[0], &uMAC[1], &uMAC[2], &uMAC[3], &uMAC[4], &uMAC[5] );
printf( "%x:%x:%x:%x:%x:%x", 
        uMAC[0], uMAC[1], uMAC[2], uMAC[3], uMAC[4], uMAC[5] );
// output: 0:0:0:0:0:0

sscanf( (const char*) strMac, 
        "%x:%x:%x:%x:%x:%x", 
        &iMAC[0], &iMAC[1], &iMAC[2], &iMAC[3], &iMAC[4], &iMAC[5] );
printf( "%x:%x:%x:%x:%x:%x", 
        iMAC[0], iMAC[1], iMAC[2], iMAC[3], iMAC[4], iMAC[5] );
// output: 11:22:33:aa:bb:cc

Update: %hhx will work for C99 and above, but I have an old codebase so I ended up going with strtoul():

char *str = strMac;
int i = 0;
for(i = 0; i < 6; i++, str+=3) {
    uMAC[i] = strtoul(str, NULL, 16);
}
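
The loop above trusts that every octet is exactly two hex digits followed by one separator. A slightly more defensive variant is sketched below; it uses the end pointer that strtoul() reports to check the separators and rejects values that do not fit in one octet. The function name parse_mac_strtoul and its return convention are made up for illustration, and the code sticks to C89 features. Note that strtoul() itself still accepts leading whitespace and an optional 0x prefix, which this sketch does not filter out.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper (not from the original post): parses a string of the
 * form "XX:XX:XX:XX:XX:XX" into out[0..5].
 * Returns 0 on success, -1 on malformed input. */
int parse_mac_strtoul(const char *s, unsigned char out[6])
{
    int i;

    for (i = 0; i < 6; i++) {
        char *end;
        unsigned long v = strtoul(s, &end, 16);

        /* No hex digits consumed, or value too large for one octet. */
        if (end == s || v > 0xFFUL)
            return -1;

        if (i < 5) {
            /* Octets 0..4 must be followed by a colon. */
            if (*end != ':')
                return -1;
            s = end + 1;
        } else if (*end != '\0') {
            /* Nothing may follow the last octet. */
            return -1;
        }

        out[i] = (unsigned char)v;
    }
    return 0;
}
```

Calling parse_mac_strtoul(strMac, uMAC) would fill uMAC and let the caller detect a truncated or malformed address instead of silently reading garbage.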

Solution

  • TL;DR - The first snippet invokes UB because of an argument type mismatch.


    To elaborate, quoting the argument type requirement for the %x format specifier from the C11 standard, chapter §7.21.6.2, fscanf() function (emphasis mine):

    x Matches an optionally signed hexadecimal integer, whose format is the same as expected for the subject sequence of the strtoul() function with the value 16 for the base argument. The corresponding argument shall be a pointer to unsigned integer.

    So, while using

     sscanf( (const char*) strMac, 
        "%x:%x:%x:%x:%x:%x", 
        &uMAC[0], &uMAC[1], &uMAC[2], &uMAC[3], &uMAC[4], &uMAC[5] );
    

    in your code, you're supplying the wrong type of argument for %x: each &uMAC[i] is a pointer to unsigned char, whereas %x requires a pointer to unsigned int. According to the standard, again,

    [...]. Unless assignment suppression was indicated by a *, the result of the conversion is placed in the object pointed to by the first argument following the format argument that has not already received a conversion result. If this object does not have an appropriate type, or if the result of the conversion cannot be represented in the object, the behavior is undefined.

    So, providing a wrong type as argument is invoking undefined behaviour.
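
Because the behaviour is undefined, the standard does not say why zeros in particular show up. One plausible explanation, offered only as a guess about the asker's platform, is byte order: %x stores a whole unsigned int at the address it is given, and on a big-endian machine the first byte of a small value such as 0x11 is zero. The sketch below (the helper name low_byte_index is hypothetical) inspects the object representation of an unsigned int safely via memcpy() and reports which byte actually holds the value.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the object representation of an unsigned int
 * holding 0x11 into bytes[] and returns the index of the byte that contains
 * 0x11. Index 0 indicates little-endian storage; index
 * sizeof(unsigned int) - 1 indicates big-endian storage. */
size_t low_byte_index(unsigned char bytes[sizeof(unsigned int)])
{
    unsigned int v = 0x11u;
    size_t i;

    memcpy(bytes, &v, sizeof v);
    for (i = 0; i < sizeof v; i++) {
        if (bytes[i] == 0x11)
            return i;
    }
    return sizeof v; /* not expected on ordinary platforms */
}
```

If the function returns sizeof(unsigned int) - 1, then bytes[0], the byte a UINT8 at the same address would see, is 0, which would match the "wrong end of the int" impression in the question. Either way, each such store also writes past the one-byte element it was aimed at, so the result cannot be relied on.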


    Solution:

    To indicate that you're going to supply a pointer to (signed or unsigned) char, you need to prefix the conversion specifier with the hh length modifier, like %hhx, with the scanf() family. The hh modifier was introduced in C99.
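
As a minimal sketch of both options, assuming the UINT8 typedef from the question: the first function uses %hhx (C99 and later), and the second reads into unsigned int temporaries, which is the type %x expects, and then narrows, which also works with pre-C99 compilers. The function names and the 0 / -1 return convention are invented for this example.

```c
#include <stdio.h>

typedef unsigned char UINT8;

/* C99 and later: the hh modifier tells sscanf() that each argument is a
 * pointer to (signed or unsigned) char. Returns 0 on success, -1 otherwise. */
int parse_mac_hhx(const char *s, UINT8 mac[6])
{
    int n = sscanf(s, "%hhx:%hhx:%hhx:%hhx:%hhx:%hhx",
                   &mac[0], &mac[1], &mac[2], &mac[3], &mac[4], &mac[5]);
    return n == 6 ? 0 : -1;
}

/* Pre-C99 alternative: let %x write into unsigned int objects, then copy
 * each value into the byte array after a range check. */
int parse_mac_tmp(const char *s, UINT8 mac[6])
{
    unsigned int tmp[6];
    int i;

    if (sscanf(s, "%x:%x:%x:%x:%x:%x",
               &tmp[0], &tmp[1], &tmp[2], &tmp[3], &tmp[4], &tmp[5]) != 6)
        return -1;

    for (i = 0; i < 6; i++) {
        if (tmp[i] > 0xFFu)   /* does not fit in one octet */
            return -1;
        mac[i] = (UINT8)tmp[i];
    }
    return 0;
}
```

The temporary-array version has the side benefit that an out-of-range octet can be rejected explicitly, whereas with %hhx a value that cannot be represented in the target object falls under the undefined-behaviour clause quoted above.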