Tags: c, arrays, language-lawyer, undefined-behavior, flexible-array-member

Can I "over-extend" an array by allocating more space to the enclosing struct?


Frankly, is code like this valid, or does it produce undefined behavior (UB)?

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct __attribute__((__packed__)) weird_struct
{
    int some;
    unsigned char value[1];
};

int main(void)
{
    unsigned char text[] = "Allie has a cat";
    struct weird_struct *ws =
        malloc(sizeof(struct weird_struct) + sizeof(text) - 1);
    ws->some = 5;
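    /* strcpy takes char pointers; the casts avoid a pointer-signedness mismatch */
    strcpy((char *)ws->value, (const char *)text);
    printf("some = %d, value = %s\n", ws->some, (const char *)ws->value);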
    free(ws);
    return 0;
}

http://ideone.com/lpByQD

I’d never have thought it valid to do something like this, but it seems that System V message queues do exactly that: see the man page.

So, if SysV msg queues can do that, perhaps I can do this too? I think I’d find this useful to send data over the network (hence the __attribute__((__packed__))).

Or perhaps this is a specific guarantee of SysV msg queues and I shouldn’t do something like that elsewhere? Or perhaps this technique can be employed, only I’m doing it wrong? I figured I’d better ask.

The - 1 in malloc(sizeof(struct weird_struct) + sizeof(text) - 1) is there because one byte is already allocated thanks to unsigned char value[1], so I can subtract it from sizeof(text).


Solution

  • The standard C way (since C99) to do this is to use a flexible array member. The last member of the structure must have an incomplete array type, and you allocate the required amount of memory at runtime.

    Something like

    struct __attribute__((__packed__)) weird_struct
    {
        int some;
        unsigned char value [ ];   //nothing, no 0, no 1, no nothing...
    }; 
    

    and later

    struct weird_struct *ws =
        malloc(sizeof(struct weird_struct) + strlen("this to be copied") + 1);
    

    or

    struct weird_struct *ws =
        malloc(sizeof(struct weird_struct) + sizeof("this to be copied"));
    

    will do the job.
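
    Putting the pieces together, a minimal complete sketch might look like the following. The names struct message and message_new are hypothetical, chosen only for illustration, and the packed attribute is left out so that the example stays within standard C.

    ```c
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Hypothetical struct for illustration; not packed, to stay in standard C. */
    struct message
    {
        int some;
        unsigned char value[];   /* flexible array member: must be the last member */
    };

    /* Allocate a message with room for a copy of text, including its '\0'.
       Returns NULL if malloc fails; the caller frees the result. */
    struct message *message_new(int some, const char *text)
    {
        size_t len = strlen(text) + 1;
        struct message *m = malloc(sizeof *m + len);
        if (m == NULL)
            return NULL;
        m->some = some;
        memcpy(m->value, text, len);
        return m;
    }

    int main(void)
    {
        struct message *m = message_new(5, "Allie has a cat");
        if (m == NULL)
            return EXIT_FAILURE;
        printf("some = %d, value = %s\n", m->some, (const char *)m->value);
        free(m);
        return 0;
    }
    ```

    Wrapping the allocation in one function keeps the size arithmetic in a single place, so callers never have to repeat it.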

    On a related note, the C11 standard, chapter §6.7.2.1, says:

    As a special case, the last element of a structure with more than one named member may have an incomplete array type; this is called a flexible array member. In most situations, the flexible array member is ignored. In particular, the size of the structure is as if the flexible array member were omitted except that it may have more trailing padding than the omission would imply. However, when a . (or ->) operator has a left operand that is (a pointer to) a structure with a flexible array member and the right operand names that member, it behaves as if that member were replaced with the longest array (with the same element type) that would not make the structure larger than the object being accessed; the offset of the array shall remain that of the flexible array member, even if this would differ from that of the replacement array. If this array would have no elements, it behaves as if it had one element but the behavior is undefined if any attempt is made to access that element or to generate a pointer one past it.


    Regarding the one-element array usage, the online GCC manual page on zero-length array support says:

    struct line {
      int length;
      char contents[0];
    };
    
    struct line *thisline = (struct line *)
      malloc (sizeof (struct line) + this_length);
    thisline->length = this_length;
    

    In ISO C90, you would have to give contents a length of 1, which means either you waste space or complicate the argument to malloc.

    This also explains the - 1 in the malloc() argument: sizeof(char) is guaranteed to be 1 in C, so the one-element array accounts for exactly one byte.
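
    To see how the two size computations compare on a given platform, a small sketch like the one below can help. The struct and function names are hypothetical, and the printed numbers depend on the compiler and ABI (in particular on trailing padding), so no particular output is assumed.

    ```c
    #include <stddef.h>
    #include <stdio.h>

    /* Hypothetical structs for illustration; neither is packed. */
    struct one_elem { int some; unsigned char value[1]; };
    struct flexible { int some; unsigned char value[]; };

    /* Size formula from the question: the array already holds one byte. */
    size_t size_one_elem(size_t payload)
    {
        return sizeof(struct one_elem) + payload - 1;
    }

    /* Size formula with a flexible array member: no correction needed. */
    size_t size_flexible(size_t payload)
    {
        return sizeof(struct flexible) + payload;
    }

    int main(void)
    {
        size_t payload = sizeof "Allie has a cat";   /* 16 bytes including '\0' */

        printf("sizeof(struct one_elem) = %zu, offset of value = %zu\n",
               sizeof(struct one_elem), offsetof(struct one_elem, value));
        printf("sizeof(struct flexible) = %zu, offset of value = %zu\n",
               sizeof(struct flexible), offsetof(struct flexible, value));
        printf("one-element request: %zu bytes\n", size_one_elem(payload));
        printf("flexible request:    %zu bytes\n", size_flexible(payload));
        return 0;
    }
    ```

    Without the packed attribute, the one-element struct may carry trailing padding after value, which is the wasted space the GCC manual mentions.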