Search code examples
cmemorymallocdmalloc

malloc implementation : checking for correct allocation alignment


my latest test is failing, but i think i've found a solution to pass it. it involves using alignof(max_align_t) (or even alignof(long double)), but i'm not entirely sure if my understanding of allocation alignment is correct..

my idea is that if the actual size of my type is not divisible by the address position, i need to add some padding until the condition is satisfied. this process helps the compiler access memory more efficiently by creating a predictable pattern!

if i'm right, so why the last test is not passing? otherwise, clarification would be appreciated!

void *malloc(size_t sz, const char *file, int line) {
  (void)file, (void)line; // avoid uninitialized variable warnings
  if (default_buffer.pos + sz > default_buffer.size) {
    // Not enough space left in default buffer for allocation
    gstats.nfail++;
    gstats.fail_size += sz;
    return nullptr;
  }
  // Here i'm checking for correct allocation alignement
  if (default_buffer.pos % sz != 0) {
    default_buffer.pos += default_buffer.pos % sz;
  }
  // Otherwise there is enough space; claim the next `sz` bytes
  void *ptr = &default_buffer.buffer[default_buffer.pos];

  default_buffer.pos += sz;
  gstats.ntotal++;
  gstats.total_size += sz;
  return ptr;
}

test.c

int main() {
double* ptr = (double*) malloc(sizeof(double));
assert((uintptr_t) ptr % alignof(double) == 0);
assert((uintptr_t) ptr % alignof(unsigned long long) == 0);
assert((uintptr_t) ptr % alignof(std::max_align_t) == 0);

    char* ptr2 = (char*) malloc(1);
    assert((uintptr_t) ptr2 % alignof(double) == 0);
    assert((uintptr_t) ptr2 % alignof(unsigned long long) == 0);
    assert((uintptr_t) ptr2 % alignof(std::max_align_t) == 0);
    
    free(ptr);
    free(ptr2);

}

enter image description here

I've already solved the problem, and I want to understand why an alternative to my solution is not working on a specific test. So basically, I'm seeking to ensure my understanding is correct!


Solution

  • As I mentioned in the comments, your alignment calculation is not correct. Imagine you want an alignment of 4, and the residual of modulo is 3. You would not add another 3 -- you would add 4 minus 3.

    Instead of using modulo, the typical way to align a value is to add one less than the alignment (so that everything except zero is guaranteed to result in an address at the next alignment boundary), and then mask off all the bits lower than the alignment:

    pos = (pos + align - 1) & ~(align - 1);
    

    Note that align must be an unsigned power of 2.

    Since you're trying to align a position in a buffer here, you must also be careful that this actually results in the same alignment in memory.

    Regarding automatic alignment, the user Eric Postpischil also commented:

    An object’s alignment is not greater than the greatest power of two that divides its size. E.g., if an object’s size is 12 bytes, its alignment requirement is at most 4 bytes. There is no need to go to 16. The greatest power of two that divides size may be calculated as size & -size.

    So, putting all this together here's a simple way you can align to an arbitrary address:

    #include <stddef.h>
    #include <stdint.h>
    
    size_t max_align(size_t size)
    {
        size_t alignment = size & -size;
        return ((alignment - 1) & (sizeof(max_align_t) - 1)) + 1;
    }
    
    void* align(void* ptr, size_t alignment)
    {
        return (void*)(((uintptr_t)ptr + alignment - 1) & ~(alignment - 1));
    }
    

    Note the dirty trick for trimming down to sizeof(max_align_t). This may or may not be better than simply introducing a branch. I'm not much of a bit-basher!

    See it in action:

    void test(char* buffer, size_t size)
    {
        size_t alignment = max_align(size);
        char* aligned = align(buffer, alignment);
        size_t offset = aligned - buffer;
        printf("-align=%p size=%zu alignment=%zu offset=%zu\n",
            aligned, size, alignment, offset);
    }
    
    struct foo {
        int a;
        char b[2];
        int c;
        void* d;
    };
    
    struct __attribute((packed)) bar {
        int a;
        char b[2];
        int c;
        void* d;
    };
    
    int main(void)
    {
        char buffer[256];
        for (size_t offset = 0; offset < 64; offset += 13) {
            char* base = buffer + offset;
            printf("\nbuffer=%p :\n", base);
            test(base, sizeof(struct foo));
            test(base, sizeof(struct bar));
        }
    }
    

    Example output:

    
    buffer=0x7ffc98db0770 :
    -align=0x7ffc98db0770 size=24 alignment=8 offset=0
    -align=0x7ffc98db0770 size=18 alignment=2 offset=0
    
    buffer=0x7ffc98db077d :
    -align=0x7ffc98db0780 size=24 alignment=8 offset=3
    -align=0x7ffc98db077e size=18 alignment=2 offset=1
    
    buffer=0x7ffc98db078a :
    -align=0x7ffc98db0790 size=24 alignment=8 offset=6
    -align=0x7ffc98db078a size=18 alignment=2 offset=0
    
    buffer=0x7ffc98db0797 :
    -align=0x7ffc98db0798 size=24 alignment=8 offset=1
    -align=0x7ffc98db0798 size=18 alignment=2 offset=1
    
    buffer=0x7ffc98db07a4 :
    -align=0x7ffc98db07a8 size=24 alignment=8 offset=4
    -align=0x7ffc98db07a4 size=18 alignment=2 offset=0