arrays, c, malloc, flexible-array-member

Understanding malloc and flexible array members


I am curious about how malloc() actually allocates memory. I am reading C Programming by K. N. King for reference, in particular chapter 17. Early in the chapter, void *malloc(size_t size) is described as a function that allocates a block of memory of size bytes and returns a void * pointer to that block, with two main applications being dynamically allocated strings and arrays. When malloc() is called, the returned pointer will be cast to the appropriate type. For example, in

int *n;
n = malloc(sizeof(*n));

the malloc() return will be cast to (int *). It was my understanding that since this cast occurs the memory block allocated by the call of malloc() contains uninitialised integers. However, after more reading in the chapter I think I am wrong but I can't figure out exactly what is going on.

The conflict in my understanding arose when reading the last section of the chapter, on flexible array members. The idea is to define a structure whose last member is "incomplete", i.e.,

struct vstring {
    int len;
    char chars[];
};

The author then goes on to say, and I quote:

A structure that contains a flexible array member is an incomplete type. An incomplete type is missing part of the information needed to determine how much memory it requires. ... In particular, an incomplete type can't be a member of another structure or an element of an array. However, an array may contain pointers to structures that have a flexible array member.

So clearly my earlier understanding must be flawed, otherwise a call such as

struct vstring *str = malloc(sizeof(struct vstring) + n);

would allocate an array containing an incomplete type.

Is the block of memory allocated by malloc() an array of a particular type after being cast? If not, then how can the following work,

struct node {
    int value;
    struct node *next;
};

struct node *new_node = malloc(sizeof(*new_node));
new_node->value = 10;

if the memory allocated by the malloc() call is not declared as elements of struct node? Even in the integer example at the beginning of the post, I would be able to access the elements of the allocated memory by subscripting n immediately.


Solution

  • the malloc() return will be cast to (int *).

    There is no cast in n = malloc(sizeof(*n));. A cast is an explicit operator in source code, the way a plus sign is an operator for addition or an asterisk is an operator for multiplication. A cast is a type name in parentheses. In n = malloc(sizeof(*n));, the value returned by malloc is automatically converted to the type of n, which is int *. This is a conversion, not a cast.
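
    As a minimal sketch of the distinction, the two hypothetical helpers below (alloc_int_converted and alloc_int_cast are names invented for this illustration) allocate the same memory. The first relies on the implicit conversion from void * to int *; the second writes the cast operator explicitly, which C permits but does not require.

```c
#include <stdlib.h>

/* No cast: the void * returned by malloc is converted implicitly
   to int * by the initialization. */
int *alloc_int_converted(void)
{
    int *n = malloc(sizeof *n);
    return n;   /* may be NULL if the allocation failed */
}

/* Explicit cast: (int *) is a cast operator written in the source.
   In C it is legal here but redundant. */
int *alloc_int_cast(void)
{
    int *n = (int *)malloc(sizeof *n);
    return n;   /* may be NULL if the allocation failed */
}
```

    Both functions return a pointer of the same type to the same kind of memory; only the second contains a cast.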

    Is the block of memory allocated by malloc() an array of a particular type after being cast?

    The memory allocated by malloc has no declared type. For formal semantic and technical reasons in the C standard, it takes on an effective type once you store data into it. This is largely irrelevant to ordinary use as long as you are not trying to do anything “funny” with the memory, such as using it as different types at different times.
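
    A small sketch of the ordinary case, with hypothetical helper names (store_and_read_int and bytes_match are not from the book or the standard): storing an int through an int * and reading it back through the same type is the well-defined path. Inspecting the bytes through unsigned char is also always permitted. The commented-out lines show the kind of "funny" access that the effective type rules do not allow.

```c
#include <stdlib.h>
#include <string.h>

/* Store an int into freshly allocated memory and read it back as int.
   Returns 1 on success, 0 if malloc failed. */
int store_and_read_int(int v, int *out)
{
    int *p = malloc(sizeof *p);
    if (p == NULL)
        return 0;
    *p = v;        /* the store gives the memory effective type int */
    *out = *p;     /* reading it back as int is well defined */
    /* float *f = (float *)p;
       float x = *f;   reading an int object as float: undefined behavior */
    free(p);
    return 1;
}

/* Character types may inspect the bytes of any object.
   Returns 1 if the stored bytes match those of v, 0 otherwise. */
int bytes_match(int v)
{
    int *p = malloc(sizeof *p);
    if (p == NULL)
        return 0;
    *p = v;
    const unsigned char *bytes = (const unsigned char *)p;
    int same = memcmp(bytes, &v, sizeof v) == 0;
    free(p);
    return same;
}
```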

    If not, then how can the following work…

    Once you have used malloc to allocate sufficient memory for an object, you may store a value for that object into the allocated memory.

    … if the memory allocated by the malloc() call is not declared as elements of struct node?

    The effective type of the memory is determined by the type of the lvalue expression used to store to the memory. An lvalue is an expression that potentially designates an object. For a declared object, as with int x;, the name of the object, x, is an lvalue expression for it. When we have a pointer, as with int *p;, then *p is an lvalue expression for the object that p points to (assuming it is a valid pointer to memory for such an object, not a null pointer or an invalid pointer).

    Once new_node has been assigned to point to memory allocated with malloc, *new_node is an lvalue expression of the appropriate type, struct node, and new_node->value is an lvalue expression for the int member of struct node. Using these lvalue expressions informs the compiler about how to treat the memory at those locations.
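
    Putting the question's two structures together, here is a hedged sketch. The helpers make_node and make_vstring are hypothetical names introduced for illustration. It assumes, as in the question, that chars holds exactly n characters with no terminating null. Note that sizeof(struct vstring) is a valid expression; it covers len (plus any padding) but not the chars array, which is why n is added to the request.

```c
#include <stdlib.h>
#include <string.h>

struct node {
    int value;
    struct node *next;
};

struct vstring {
    int len;
    char chars[];   /* flexible array member */
};

/* Allocate raw memory large enough for one struct node, then store
   into it through struct node lvalues. Returns NULL on failure. */
struct node *make_node(int value)
{
    struct node *new_node = malloc(sizeof *new_node);
    if (new_node == NULL)
        return NULL;
    new_node->value = value;
    new_node->next = NULL;
    return new_node;
}

/* Allocate the fixed part of struct vstring plus n extra bytes for
   chars. The characters are copied without a terminating null. */
struct vstring *make_vstring(const char *s)
{
    size_t n = strlen(s);
    struct vstring *str = malloc(sizeof(struct vstring) + n);
    if (str == NULL)
        return NULL;
    str->len = (int)n;
    memcpy(str->chars, s, n);
    return str;
}
```

    In both helpers malloc only hands back a block of bytes of the requested size. No array of struct vstring is ever created, so the restriction quoted from the book does not come into play.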