c pointers undefined-behavior c99

Is it safe and defined behaviour to cast a pointer to another level of indirection?


I am working with nested sparse arrays of pointers and want to generalise allocation and initialisation of the pointers at each level.

I decided to use a loop to iterate over each level, but to do so I have to cast pointers to a different level of indirection. The code seems to work, but I was wondering whether this is defined behaviour or whether it is risky to do so. Are there also better ways of doing this?

Here is a code sample of what I am trying to do:

#include <stdio.h>
#include <stdarg.h>
#include <stdlib.h>
#include <string.h>

typedef char* type;

type *get(type **ptr, int depth, int len, ...) {
    va_list args;
    va_start(args, len);

    while (--depth >= 1) {
        if (!*ptr) {
            *ptr = malloc(len * sizeof(void *));
            memset(*ptr, 0, len * sizeof(void *));
        }
        int i = va_arg(args, int);
        ptr = (type **) (*ptr + i);
    }

    if (!*ptr) {
        *ptr = malloc(len * sizeof(type));
        memset(*ptr, 0, len * sizeof(type));
    }

    int i = va_arg(args, int);
    va_end(args);
    return &(*ptr)[i];
}

int main() {
    type *****a = NULL;
    type *ptr = get((type **) &a, 5, 10, 1, 2, 3, 4, 5);
    *ptr = "test";
    printf("%s\n", a[1][2][3][4][5]);
    return 0;
}

Also, is it possible to make this even more general and remove get's reliance on type? I tried replacing all the casts to type ** with void ** (and also char **) and passing in sizeof(type) as a parameter, but it is causing a segfault in printf. Why is this and is it possible to get around this?


Solution

  • Is it safe and defined behaviour to cast a pointer to another level of indirection?

    It's unclear why you think it might not be safe. In particular, "level of indirection" is not a property of pointer values or pointer types. Rather, it is an interpretation of their semantic meaning with respect to some data model. That you have a pointer of type T ** does not necessarily mean that you can dereference it twice to get a T. That might or might not be the case. All you really know from that type alone is that it is a pointer type whose referenced type is T *.

    The language spec permits type conversions among different object-pointer types:

    A pointer to an object type can be converted to a pointer to a different object type. If the resulting pointer is not correctly aligned for the referenced type, the behavior is undefined. Otherwise, when converted back again, the result shall compare equal to the original pointer.

    (C24 6.3.3.3/7)

    (C99 contains slightly different wording with the same effect, at paragraph 6.3.2.3/7.)
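
    As a minimal sketch of what that paragraph guarantees, the function below performs only the conversion and the conversion back, and never dereferences the converted pointer. The name round_trip_preserves is hypothetical and not part of the question's code, and the sketch assumes an implementation on which the alignment condition is met.

```c
#include <stdbool.h>

typedef char *type;

/* Hypothetical helper: converts a type ****** to type ** and back again,
   then reports whether the round trip preserved the pointer value.
   The converted pointer is never dereferenced. */
bool round_trip_preserves(type ******original) {
    type **converted = (type **) original;          /* conversion only */
    type ******restored = (type ******) converted;  /* convert back */
    return restored == original;
}
```

    Called as round_trip_preserves(&a) with a declared as type *****a = NULL, this should report true whenever the converted pointer is suitably aligned.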

    In this code ...

        type *****a = NULL;
        type *ptr = get((type **) &a, 5, 10, 1, 2, 3, 4, 5);
    

    ... the type of &a is type ****** (shudder), an object pointer type whose referenced type is type *****. Unless there is an alignment issue, it is allowed to convert such a pointer value to type type **, and doing so does not lose any information necessary to convert back to a type ****** that compares equal to the original. That the types involved have different levels of pointer derivation is largely irrelevant.

    That both types involved have pointer types as their referenced types makes it unlikely (but not impossible) that there is any alignment issue. Whether there is or not is a matter of your implementation, but in most implementations, all object pointer types have the same alignment requirement. This is a characteristic that you should find documented for your implementation of interest.
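
    If the documentation is hard to find, the alignment requirements can also be compared in code. This is only a sketch: it relies on _Alignof, which is a C11 feature rather than C99, and the function name conversion_alignment_ok is hypothetical.

```c
#include <stdbool.h>

typedef char *type;

/* Hypothetical check (requires C11 for _Alignof).
   A type ****** points at an object of type type *****.  After conversion
   to type **, the pointer must be correctly aligned for type *.  Because
   alignments are powers of two, that holds whenever the alignment of
   type * does not exceed the alignment of type *****. */
bool conversion_alignment_ok(void) {
    return _Alignof(type *) <= _Alignof(type *****);
}
```

    The same comparison could be placed in a _Static_assert so that a mismatch is reported at compile time instead of at run time.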

    But you seem to have asked the wrong question.

    It is safe to perform the cast, supposing that there is no alignment issue, but what you can do with the result while staying within defined behavior is limited. You may assign it, pass it to functions, compare it for (in)equality with another pointer of the same type or with a null pointer constant, or convert it to another object pointer type. Attempting to access the object to which it points produces undefined behavior, however, and I would argue that any kind of pointer arithmetic with it and any kind of relational expressions involving it also have UB.

    Your example code does attempt to access the object to which the converted pointer points, both to read it and to write it. Those attempts produce UB.
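
    One possible way to avoid those accesses, offered as a sketch rather than a definitive fix, is to give every interior level the same element type, void *, so that each slot is always read and written as the void * it really is. Only the final level holds the caller's element type, whose size is passed in. The name get_generic, its parameter order, and the elem_size parameter are assumptions made for illustration; the sketch also never frees anything.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical alternative to get(): interior levels are arrays of void *,
   and the last level is an array of len elements of elem_size bytes each.
   Returns a pointer to the selected element, or NULL if allocation fails. */
void *get_generic(void **root, size_t elem_size, int depth, int len, ...) {
    assert(root != NULL && elem_size > 0 && depth >= 1 && len > 0);

    va_list args;
    va_start(args, len);

    void **slot = root;   /* always points at a genuine void * object */
    void *result = NULL;

    for (int level = 1; level <= depth; level++) {
        int last = (level == depth);

        if (*slot == NULL) {
            if (last) {
                /* Leaf array: zero-filled bytes.  Treating those bytes as
                   null pointers assumes all-bits-zero is a null pointer. */
                *slot = calloc((size_t) len, elem_size);
            } else {
                /* Interior array: store NULL explicitly in every slot. */
                void **children = malloc((size_t) len * sizeof *children);
                if (children != NULL) {
                    for (int k = 0; k < len; k++) {
                        children[k] = NULL;
                    }
                }
                *slot = children;
            }
            if (*slot == NULL) {
                break;    /* allocation failed */
            }
        }

        int i = va_arg(args, int);
        if (last) {
            result = (unsigned char *) *slot + (size_t) i * elem_size;
        } else {
            slot = (void **) *slot + i;   /* the array really holds void * */
        }
    }

    va_end(args);
    return result;
}
```

    With void *root = NULL, a call such as get_generic(&root, sizeof(type), 5, 10, 1, 2, 3, 4, 5) would return a pointer that the caller converts to type * before storing through it. The cost of this design is that the tree can no longer be read back with a[1][2][3][4][5]; reading has to go through the same kind of traversal, for example by calling the function again with the same indices.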