c pointers language-lawyer undefined-behavior implicit-conversion

Is passing pointer to an array as pointer to pointer UB in C?


I have the following code:

#include <stdlib.h>
#include <stdio.h>

void func(int **b)
{
    printf("b = %p\n", (void *)b); // 0x7ffe76932330 (%p expects a void *)
    *b = *b + 1;
}

int main(void)
{
    int b[10] = {0};

    printf("b = %p\n", (void *)&b[0]); // 0x7ffe76932330
    printf("%d\n", b[0]);      // 0

    func(&b);

    printf("%d\n", b[0]); // 4
    return 0;
}

Does this code have UB? To me it seems so, at least because the types differ and there is no explicit cast: int (*)[10] is not int **.

Also, what if I have char b[] = "some string"; instead? The behavior is almost the same... weird.


Solution

  • Passing the pointer by itself isn't necessarily undefined behavior, but subsequently using the converted pointer is.

    C allows conversions from one object pointer type to another and back, as documented in section 6.3.2.3p7 of the C standard:

    A pointer to an object type may be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer. When a pointer to an object is converted to a pointer to a character type, the result points to the lowest addressed byte of the object. Successive increments of the result, up to the size of the object, yield pointers to the remaining bytes of the object.

    So assuming there's no alignment issue (e.g. on a typical 64-bit system, the array starts at an address that is a multiple of 8), just the act of passing an int (*)[10] to a function expecting an int ** is allowed, although most compilers will warn about converting between incompatible pointer types unless you add an explicit cast.
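The round-trip guarantee quoted above can be sketched as follows. This is a minimal illustration, not part of the original answer: the function name `round_trip_ok` is hypothetical, and the array is deliberately over-aligned with `_Alignas(max_align_t)` (C11) so that the alignment precondition certainly holds. The converted pointer is never dereferenced.

```c
#include <stddef.h>   /* max_align_t */

/* Hypothetical helper: converts int (*)[10] to int ** and back,
   without ever dereferencing the converted pointer. */
int round_trip_ok(void)
{
    /* Over-aligned on purpose (an assumption of this sketch) so the
       pointer is suitably aligned for int * on any implementation. */
    _Alignas(max_align_t) int b[10] = {0};

    int (*orig)[10] = &b;
    int **converted = (int **)orig;            /* conversion only */
    int (*back)[10] = (int (*)[10])converted;  /* convert back */

    /* 6.3.2.3p7: the result shall compare equal to the original. */
    return back == orig;
}
```

Calling `round_trip_ok()` from `main` should return 1. The explicit casts also silence the incompatible-pointer diagnostic, which is one reason to be careful with them.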

    The undefined behavior happens here:

    *b = *b + 1;
    

    This is because you're accessing an object through an incompatible pointer type (other than a char *). The rules regarding what you're allowed to access are listed in section 6.5p7:

    An object shall have its stored value accessed only by an lvalue expression that has one of the following types:

    • a type compatible with the effective type of the object,
    • a qualified version of a type compatible with the effective type of the object,
    • a type that is the signed or unsigned type corresponding to the effective type of the object,
    • a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,
    • an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or
    • a character type.
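As a small illustration of the last item in the list, the sketch below reads the bytes of an int array through an `unsigned char *`, which the rule permits. The helper name `all_bytes_zero` is hypothetical and chosen only for this example.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 if every byte of the array is zero.
   Accessing the ints through a character type is allowed by 6.5p7. */
int all_bytes_zero(const int *arr, size_t count)
{
    const unsigned char *bytes = (const unsigned char *)arr;

    for (size_t i = 0; i < count * sizeof *arr; i++) {
        if (bytes[i] != 0)
            return 0;   /* found a nonzero byte */
    }
    return 1;
}
```

The same loop written with an `int **` or a `long *` in place of `unsigned char *` would not be covered by any item in the list.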

    Dereferencing an int (*)[10] as an int ** doesn't meet any of the above criteria: *b is an lvalue of type int *, while the object it designates is an array of int. So evaluating *b is undefined behavior.
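The sketch below does two things, both under stated assumptions. `simulate_bad_write` is a hypothetical helper that mimics, with defined behavior, what the undefined write plausibly did on the asker's machine: it treats the first bytes of the array as an address and adds `sizeof(int)` to it. That would explain the printed 4 on a little-endian system with 4-byte int, but nothing in the standard promises this outcome. `advance` and `demo_fixed` (also hypothetical names) show a well-defined alternative: give the function a real `int *` object to modify.

```c
#include <stdint.h>   /* uintptr_t (optional type, assumed available) */
#include <string.h>   /* memcpy */

/* Defined-behavior imitation of the UB write: read the array's first
   bytes as an integer "address", bump it by sizeof(int), store it back.
   The array must be at least sizeof(uintptr_t) bytes long. */
void simulate_bad_write(int *arr)
{
    uintptr_t bits;

    memcpy(&bits, arr, sizeof bits);
    bits += sizeof(int);              /* what "pointer + 1" typically does */
    memcpy(arr, &bits, sizeof bits);
}

/* Well-defined version of func: p points to an actual int * object. */
void advance(int **p)
{
    *p = *p + 1;
}

int demo_fixed(void)
{
    int b[10] = {10, 20, 30};
    int *p = b;       /* a real pointer object, initialized from the array */

    advance(&p);      /* p now points at b[1]; the array is untouched */
    return *p;
}
```

Here `demo_fixed()` returns 20, and after `simulate_bad_write` on a zeroed array the value `sizeof(int)` lands in either the first or the second element, depending on byte order.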