I'm trying to make a single object larger than 2GB using new operator. But if the size of the object is larger than 0x7fffffff, The size of memory to be allocated become strange. I think it is done by compiler because the assembly code itself use strange size of memory allocation.
I'm using Visual Stuio 2015 and configuration is Release, x64.
Is it bug of VS2015? otherwise, I want to know why the limitation exists.
The example code is as below with assembly code.
struct chunk1MB
{
char data[1024 * 1024];
};
class chunk1
{
chunk1MB data1[1024];
chunk1MB data2[1023];
char data[1024 * 1024 - 1];
};
class chunk2
{
chunk1MB data1[1024];
chunk1MB data2[1024];
};
auto* ptr1 = new chunk1;
00007FF668AF1044 mov ecx,7FFFFFFFh
00007FF668AF1049 call operator new (07FF668AF13E4h)
auto* ptr2 = new chunk2;
00007FF668AF104E mov rcx,0FFFFFFFF80000000h // must be 080000000h
00007FF668AF1055 mov rsi,rax
00007FF668AF1058 call operator new (07FF668AF13E4h)
Use a compiler like clang-cl that isn't broken, or that doesn't have intentional signed-32-bit implementation limits on max object size, whichever it is for MSVC. (Could this be affected by a largeaddressaware option?)
Current MSVC (19.33 on Godbolt) has the same bug, although it does seem to handle 2GiB static objects. But not 3GiB static objects; adding another 1GiB member leads to wrong code when accessing a byte more than 2GiB from the object's start (Godbolt -
mov BYTE PTR chunk2 static_chunk2-1073741825, 2
- note the negative offset.)
GCC targeting Linux makes correct code for the case of a 3GiB object, using mov r64, imm64
to get the absolute address into a register, since a RIP-relative addressing mode isn't usable. (In general you'd need gcc -mcmodel=medium
to work correctly when some .data / .bss addresses are linked outside the low 2GiB and/or more than 2GiB away from code.)
MSVC seems to have internally truncated the size to signed 32-bit, and then sign-extended. Note the arg it passes to new
: mov rcx, 0FFFFFFFF80000000h
instead of mov ecx, 80000000h
(which would set RCX = 0000000080000000h by implicit zero-extension when writing a 32-bit register.)
In a function that returns sizeof(chunk2);
as a size_t
, it works correctly, but interestingly prints the size as negative in the source. That might be innocent, e.g. after realizing that the value fits in a 32-bit zero-extended value, MSVC's asm printing code might just always print 32-bit integers as signed decimal, with unsigned hex in a comment.
It's clearly different from how it passes the arg to new
; in that case it used 64-bit operand-size in the machine code, so the same 32-bit immediate gets sign-extended to 64-bit, to a huge value near SIZE_MAX, which is of course vastly larger than any possible max object size for x86-64. (The 48-bit virtual address spaces is 1/65536th of the 64-bit value-range of size_t).
unsigned __int64 sizeof_chunk2(void) PROC ; sizeof_chunk2, COMDAT
mov eax, -2147483648 ; 80000000H
ret 0
unsigned __int64 sizeof_chunk2(void) ENDP ; sizeof_chunk2