c++ visual-c++ visual-studio-2015 x86-64 dynamic-memory-allocation

How can I make a single object larger than 2GB using the new operator?

I'm trying to create a single object larger than 2GB using the new operator. But if the size of the object is larger than 0x7fffffff, the size passed to the allocation becomes strange. I think the compiler is responsible, because the generated assembly code itself uses the strange allocation size.

I'm using Visual Studio 2015, and the configuration is Release, x64.

Is this a bug in VS2015? If not, I want to know why the limitation exists.

The example code is below, together with the generated assembly.

struct chunk1MB
{
    char data[1024 * 1024];
};

class chunk1
{
    chunk1MB data1[1024];
    chunk1MB data2[1023];
    char data[1024 * 1024 - 1];
};

class chunk2
{
    chunk1MB data1[1024];
    chunk1MB data2[1024];
};

    auto* ptr1 = new chunk1;
00007FF668AF1044  mov         ecx,7FFFFFFFh  
00007FF668AF1049  call        operator new (07FF668AF13E4h)  

    auto* ptr2 = new chunk2;
00007FF668AF104E  mov         rcx,0FFFFFFFF80000000h  // must be 080000000h
00007FF668AF1055  mov         rsi,rax  
00007FF668AF1058  call        operator new (07FF668AF13E4h)

Solution

Use a compiler like clang-cl that isn't broken, or that doesn't have intentional signed-32-bit implementation limits on max object size, whichever it is for MSVC. (Could this be affected by a largeaddressaware option?)

Current MSVC (19.33 on Godbolt) has the same bug, although it does seem to handle 2GiB static objects. But not 3GiB static objects; adding another 1GiB member leads to wrong code when accessing a byte more than 2GiB from the object's start. On Godbolt it emits mov BYTE PTR chunk2 static_chunk2-1073741825, 2 (note the negative offset).

GCC targeting Linux makes correct code for the case of a 3GiB object, using mov r64, imm64 to get the absolute address into a register, since a RIP-relative addressing mode isn't usable. (In general you'd need gcc -mcmodel=medium to work correctly when some .data / .bss addresses are linked outside the low 2GiB and/or more than 2GiB away from code.)

MSVC seems to have internally truncated the size to signed 32-bit, and then sign-extended. Note the arg it passes to new: mov rcx, 0FFFFFFFF80000000h instead of mov ecx, 80000000h (which would set RCX = 0000000080000000h by implicit zero-extension when writing a 32-bit register.)
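The suspected truncate-then-sign-extend step can be modelled in portable C++. This is only a sketch of the arithmetic, not MSVC's actual internals; the helper names sign_extend_low32 and zero_extend_low32 are made up for illustration.

```cpp
#include <cstdint>

// Hypothetical model: keep the low 32 bits of a size, treat bit 31 as a
// sign bit, and widen to 64 bits (what "mov rcx, imm32" does).
constexpr std::uint64_t sign_extend_low32(std::uint64_t size) {
    return (size & 0x80000000ull)
               ? ((size & 0xFFFFFFFFull) | 0xFFFFFFFF00000000ull)
               : (size & 0xFFFFFFFFull);
}

// Hypothetical model: keep the low 32 bits and clear the upper half
// (what "mov ecx, imm32" does by implicit zero-extension).
constexpr std::uint64_t zero_extend_low32(std::uint64_t size) {
    return size & 0xFFFFFFFFull;
}

// sizeof(chunk1) == 0x7FFFFFFF: bit 31 is clear, so both paths agree.
static_assert(sign_extend_low32(0x7FFFFFFFull) == 0x7FFFFFFFull,
              "sizes below 2GiB survive sign-extension");

// sizeof(chunk2) == 0x80000000: sign-extension yields the value seen in RCX.
static_assert(sign_extend_low32(0x80000000ull) == 0xFFFFFFFF80000000ull,
              "2GiB becomes a huge bogus size when sign-extended");

// Zero-extension would have produced the intended argument.
static_assert(zero_extend_low32(0x80000000ull) == 0x80000000ull,
              "zero-extension preserves 2GiB");
```

The second assertion reproduces the 0FFFFFFFF80000000h immediate from the question's disassembly, which is consistent with the sign-extension explanation but does not prove it.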

In a function that returns sizeof(chunk2); as a size_t, it works correctly, but interestingly prints the size as negative in the asm output. That might be innocent, e.g. after realizing that the value fits in a 32-bit zero-extended value, MSVC's asm printing code might just always print 32-bit integers as signed decimal, with unsigned hex in a comment.

It's clearly different from how it passes the arg to new; in that case it used 64-bit operand-size in the machine code, so the same 32-bit immediate gets sign-extended to 64-bit, to a huge value near SIZE_MAX, which is of course vastly larger than any possible max object size for x86-64. (The 48-bit virtual address space is 1/65536th of the 64-bit value-range of size_t.)
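To put numbers on that, here is a small compile-time check of the arithmetic. The constant names are invented for this sketch, and the 48-bit figure is the address width assumed in the paragraph above.

```cpp
#include <cstdint>

// Value MSVC actually passes to operator new for chunk2 (from the disassembly).
constexpr std::uint64_t kBogusSize = 0xFFFFFFFF80000000ull;

// Size of a 48-bit virtual address space (assumed width, as in the text).
constexpr std::uint64_t kVirtualAddressSpace = 1ull << 48;

// How many 48-bit address spaces fit in the 64-bit range of size_t.
constexpr std::uint64_t kRatio = 1ull << (64 - 48);

static_assert(kRatio == 65536, "2^64 / 2^48 == 65536");

// The requested size exceeds the whole virtual address space, so no
// allocator could ever satisfy it.
static_assert(kBogusSize > kVirtualAddressSpace,
              "bogus size is larger than the address space");

// It sits just 0x7FFFFFFF bytes below the maximum 64-bit value.
static_assert(UINT64_MAX - kBogusSize == 0x7FFFFFFFull,
              "bogus size is within 2GiB of UINT64_MAX");
```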

unsigned __int64 sizeof_chunk2(void) PROC                   ; sizeof_chunk2, COMDAT
        mov     eax, -2147483648              ; 80000000H
        ret     0
unsigned __int64 sizeof_chunk2(void) ENDP                   ; sizeof_chunk2
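For reference, the two sizes in the question land exactly on either side of the signed 32-bit boundary. A minimal sketch, assuming a 64-bit target where size_t is 64 bits wide and using the question's types (written as structs here):

```cpp
#include <cstddef>

struct chunk1MB {
    char data[1024 * 1024];            // exactly 1 MiB, alignment 1, no padding
};

struct chunk1 {
    chunk1MB data1[1024];              // 1024 MiB
    chunk1MB data2[1023];              // 1023 MiB
    char data[1024 * 1024 - 1];        // 1 MiB minus one byte
};

struct chunk2 {
    chunk1MB data1[1024];              // 1024 MiB
    chunk1MB data2[1024];              // 1024 MiB
};

// chunk1 is the largest size that still fits in a signed 32-bit integer.
static_assert(sizeof(chunk1) == 0x7FFFFFFFull, "chunk1 is 2GiB - 1");

// chunk2 is one byte larger, so bit 31 is set and a signed 32-bit view
// of the size is negative.
static_assert(sizeof(chunk2) == 0x80000000ull, "chunk2 is exactly 2GiB");
```

That matches the immediates in the disassembly: 7FFFFFFFh for chunk1, and 80000000h (printed as -2147483648) for chunk2.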

How can I make a single object larger than 2GB using the new operator?

This looks like a compiler bug or intentional implementation limit; report it to Microsoft if it's not already known.