c++ memory-alignment placement-new memory-layout alignas

How exactly does alignment impact memory layout and the behaviour of placement new?


We read a lot about alignment and how important it is, for example when using placement new, but I was wondering: how exactly does it alter the layout of memory?

Obviously, if we do

char buffer[10];
std::cout << sizeof buffer;

and

alignas(int) char buffer[10];
std::cout << sizeof buffer;

we get the same result, which is 10.

But the behaviour cannot be exactly the same, can it? How can the two be distinguished? I went looking for the answer on godbolt and tested the following code:

#include <new>  // declares placement operator new

int main() {
    alignas(int) char buffer[10];
    new (buffer) int;
}

which, under GCC 8.2 with no optimisations, results in the following assembly:

operator new(unsigned long, void*):
    push    rbp
    mov     rbp, rsp
    mov     QWORD PTR [rbp-8], rdi
    mov     QWORD PTR [rbp-16], rsi
    mov     rax, QWORD PTR [rbp-16]
    pop     rbp
    ret
main:
    push    rbp
    mov     rbp, rsp
    sub     rsp, 16
    lea     rax, [rbp-12]
    mov     rsi, rax
    mov     edi, 4
    call    operator new(unsigned long, void*)
    mov     eax, 0
    leave
    ret

Let's change the code slightly by removing the alignas(int) part. Now, the generated assembly is slightly different:

operator new(unsigned long, void*):
    push    rbp
    mov     rbp, rsp
    mov     QWORD PTR [rbp-8], rdi
    mov     QWORD PTR [rbp-16], rsi
    mov     rax, QWORD PTR [rbp-16]
    pop     rbp
    ret
main:
    push    rbp
    mov     rbp, rsp
    sub     rsp, 16
    lea     rax, [rbp-10]
    mov     rsi, rax
    mov     edi, 4
    call    operator new(unsigned long, void*)
    mov     eax, 0
    leave
    ret

Notably, it differs only in the lea instruction, where the second operand is [rbp-10] instead of the [rbp-12] we had in the alignas(int) version.

Please note that I generally do not understand assembly. I cannot write it, but I can somewhat read it. To my understanding, the difference simply alters the offset of the memory address that will hold our placement-newed int.

But what does it achieve? Why do we need that? Suppose we have a 'generic' representation of the buffer array as follows:

[ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ] [ ]

Now, I would assume that after placement-newing the int (with or without the alignment), we would end up with something like this:

[x] [x] [x] [x] [ ] [ ] [ ] [ ] [ ] [ ]

where x represents a single byte of an int (we assume that sizeof(int) == 4).

But I must be missing something. There is more to it and I do not know what. What exactly do we achieve by giving the buffer an alignment suitable for int? What happens if we don't?


Solution

  • On some architectures, types must be aligned for operations to work correctly. The address of an int, for example, might need to be a multiple of 4. If it isn't aligned that way, CPU instructions that operate on integers in memory may fault or otherwise fail to work.

    Even on architectures where misaligned accesses work, alignment is still important for performance, because it ensures that integers and similar objects don't straddle cache-line boundaries.

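    Aligning your char buffer to an int boundary doesn't change the way placement new works. It only guarantees that the address of the buffer is a multiple of alignof(int) (typically 4), so you can use placement new to put an int at the start of the buffer without violating any alignment constraints. That is the difference in your assembly: under the x86-64 calling convention used here, rbp is 16-byte aligned at that point, so [rbp-12] is a multiple of 4, while [rbp-10] is only a multiple of 2.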
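
To make the point above concrete, here is a minimal sketch, not taken from the original answer. The helper names (is_aligned_for, construct_in_aligned_buffer, construct_in_plain_buffer) are hypothetical and exist only for illustration. It assumes that converting a pointer to std::uintptr_t yields the numeric address, which holds on mainstream platforms but is formally implementation-defined. The first function relies on alignas(int) to make the start of the buffer a valid place for an int; the second makes no such assumption and uses std::align to find the first suitably aligned byte inside a plain char buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>  // std::align
#include <new>     // placement new

// True if p is a suitable address for an object of type T.
// Assumption: the uintptr_t value of a pointer is its numeric address.
template <typename T>
bool is_aligned_for(const void* p) {
    return reinterpret_cast<std::uintptr_t>(p) % alignof(T) == 0;
}

// Constructs an int at the start of a buffer declared with alignas(int)
// and returns the value read back through the pointer placement new returns.
int construct_in_aligned_buffer(int value) {
    alignas(int) char buffer[10];
    // alignas(int) guarantees this; sizeof buffer is still 10.
    assert(is_aligned_for<int>(buffer));
    int* p = new (buffer) int(value);
    return *p;
}

// Constructs an int inside a buffer with no alignment guarantee.
// std::align moves the pointer forward to the first address suitable
// for int and shrinks `space` by the number of bytes it skipped.
// Returns the number of skipped bytes, or -1 if no int fits.
int construct_in_plain_buffer(int value, int& out) {
    char buffer[10];
    void* p = buffer;
    std::size_t space = sizeof buffer;
    if (std::align(alignof(int), sizeof(int), p, space) == nullptr) {
        return -1;
    }
    int* q = new (p) int(value);
    out = *q;
    return static_cast<int>(sizeof buffer - space);
}
```

With the alignas(int) buffer the int always starts at byte 0, which matches the [x] [x] [x] [x] picture in the question. With the plain buffer the int may have to start up to alignof(int) - 1 bytes later, and how many bytes are skipped depends on where the compiler happens to place the array.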