Tags: c, memory, language-lawyer, raw-data, alignof

Effective type in packed raw data


To avoid memory fragmentation on very large datasets, I have implemented a doubly linked list that avoids calling malloc twice per element (once for the data and once for the node holding the prev and next pointers). Instead, it allocates the required space in one shot, using alignof to compute the offset of the struct containing the prev and next pointers.

The full implementation is here; the relevant part is:

#include <stdlib.h>
#include <stdint.h>
#include <stdalign.h>

struct node
{
    struct node *prev;
    struct node *next;
};

typedef struct
{
    struct node *head;
    struct node *tail;
    size_t offset;
    size_t size;
} klist;

klist *klist_create(size_t size)
{
    klist *list = calloc(1, sizeof *list);

    if (list != NULL)
    {
        size_t align = alignof(struct node);

        // Round size up to nearest multiple of alignof(struct node)
        list->offset = (size + (align - 1)) / align * align;
    }
    return list;
}

// Macro arguments are parenthesized so that expressions can be passed safely
#define klist_node(list, data) ((void *)((uintptr_t)(const void *)(data) + (list)->offset))
#define klist_data(list, node) ((void *)((uintptr_t)(const void *)(node) - (list)->offset))

void *klist_push_head(klist *list)
{
    void *data = calloc(1, list->offset + sizeof(struct node));

    if (data == NULL)
    {
        return NULL;
    }

    struct node *node = klist_node(list, data);

    if (list->head != NULL)
    {
        list->head->prev = node;
        node->next = list->head;
    }
    else
    {
        list->tail = node;
    }
    list->head = node;
    list->size++;
    return data;
}

void *klist_head(const klist *list)
{
    if (list->head != NULL)
    {
        return klist_data(list, list->head);
    }
    return NULL;
}

...

Then, in main:

struct data
{
    int key;
    char *value;
};

klist *list = klist_create(sizeof(struct data));
struct data *data = klist_push_head(list);

data->key = 1;
data->value = "one";

where data can be a pointer to any primitive or composite type.
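
To make the layout concrete, here is a minimal sketch of the same offset computation, checked in isolation. The names round_up and layout_check are made up for this illustration and are not part of the list implementation. It also uses char * arithmetic rather than uintptr_t, and assumes that a pointer converted to uintptr_t reflects the address, as it does on common platforms.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

struct node
{
    struct node *prev;
    struct node *next;
};

/* Hypothetical helper: round size up to the next multiple of align. */
size_t round_up(size_t size, size_t align)
{
    return (size + (align - 1)) / align * align;
}

/* Hypothetical check: allocate one block holding `size` bytes of data
   followed by a node, and verify where the node lands.
   Returns the computed offset (0 is also returned if allocation fails). */
size_t layout_check(size_t size)
{
    size_t offset = round_up(size, alignof(struct node));
    char *data = calloc(1, offset + sizeof(struct node));

    if (data == NULL)
    {
        return 0;
    }

    /* char * arithmetic stays inside the single allocated block */
    char *node_addr = data + offset;

    /* Assumption: the integer value of the pointer reflects its address */
    assert((uintptr_t)node_addr % alignof(struct node) == 0);
    /* Subtracting the offset recovers the data pointer */
    assert(node_addr - offset == data);

    free(data);
    return offset;
}
```

For example, layout_check(1) yields alignof(struct node), since a single byte of data is padded up to the first address where a node may start.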

The thing is that, since this is not the typical self-contained structure holding all the members involved:

struct node
{
    void *data;
    struct node *prev;
    struct node *next;
};

I am concerned about the effective type rule:

If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value.
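
As a minimal sketch of what the rule means for allocated storage (the function name effective_type_demo is made up for illustration):

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical demo: returns 1 if every check passes, 0 otherwise
   (including when the allocation fails). */
int effective_type_demo(void)
{
    void *raw = malloc(sizeof(int)); /* allocated storage: no declared type */

    if (raw == NULL)
    {
        return 0;
    }

    int *ip = raw;
    *ip = 42;                 /* store through an int lvalue: the effective type becomes int */
    int ok = (*ip == 42);     /* reading it back as int matches the effective type */

    /* Copying the bytes out is a character-type access, which is always allowed */
    unsigned char bytes[sizeof(int)];
    memcpy(bytes, raw, sizeof bytes);

    int copy;
    memcpy(&copy, bytes, sizeof copy);
    ok = ok && (copy == 42);

    free(raw);
    return ok;
}
```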

How does this rule affect the implementation of the list?

Is it legal/portable code?


Solution

  • I do not clearly see every shortcoming of OP's approach, but certain parts are questionable: the C standard does not specify that adding an offset to a pointer converted via (uintptr_t)(void *), and converting the result back, yields the desired final pointer.


    An alternative is to use a flexible array member, which handles the padding issues too.

    Something like the following.

    // Error checking omitted for brevity.
    #include <stddef.h> // max_align_t
    #include <stdlib.h> // calloc

    struct node {
      struct node *prev;
      struct node *next;
      max_align_t data[]; // Flexible array member (FAM) at worst-case alignment.
    };
    
    typedef struct {
      struct node *head;
      struct node *tail;
      size_t data_size;
      size_t size;
    } klist;
    
    klist* klist_create(size_t data_size) {
      klist *list = calloc(1, sizeof *list);
      list->data_size = data_size;
      return list;
    }
    
    struct node* klist_push_head(klist *list) {
      struct node *nd = calloc(1, sizeof *nd + list->data_size);
      if (list->head) {
        list->head->prev = nd;
        nd->next = list->head;
      } else {
        list->tail = nd;
      }
      list->head = nd;
      list->size++;
      return nd;
    }
    
    #define klist_data(/* struct node* */ nd) ((void *)&((nd)->data))
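
For completeness, here is a self-contained sketch of how the flexible array member version might be used with the struct data from the question. The error checks and the function klist_demo are additions for this illustration only, not part of the code above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct node {
  struct node *prev;
  struct node *next;
  max_align_t data[]; // flexible array member at worst-case alignment
};

typedef struct {
  struct node *head;
  struct node *tail;
  size_t data_size;
  size_t size;
} klist;

#define klist_data(/* struct node* */ nd) ((void *)&((nd)->data))

klist *klist_create(size_t data_size) {
  klist *list = calloc(1, sizeof *list);
  if (list != NULL) {
    list->data_size = data_size;
  }
  return list;
}

struct node *klist_push_head(klist *list) {
  // One allocation: the node header followed by room for the payload
  struct node *nd = calloc(1, sizeof *nd + list->data_size);
  if (nd == NULL) {
    return NULL;
  }
  if (list->head) {
    list->head->prev = nd;
    nd->next = list->head;
  } else {
    list->tail = nd;
  }
  list->head = nd;
  list->size++;
  return nd;
}

struct data {
  int key;
  char *value;
};

// Hypothetical demo: push one element, fill it in, and read the key back
// through the list head. Returns the key, or -1 if an allocation fails.
int klist_demo(void) {
  klist *list = klist_create(sizeof(struct data));
  if (list == NULL) {
    return -1;
  }
  struct node *nd = klist_push_head(list);
  if (nd == NULL) {
    free(list);
    return -1;
  }

  struct data *d = klist_data(nd);
  d->key = 1;
  d->value = "one";

  assert(list->size == 1 && list->head == list->tail);
  int key = ((struct data *)klist_data(list->head))->key;

  free(nd);
  free(list);
  return key;
}
```

Note that here klist_push_head returns the node rather than the data, so the caller goes through klist_data to reach the payload.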