Tags: c, memory, compiler-construction, stack, v8

How does a JIT compiler like v8 structure its memory (i.e. the stack, heap, code, and data)?


I am trying to figure out how Linux, or even C, stores the stack, given that each process can have its own stack. Every diagram shows the assembly instructions and registers used to manage an extremely primitive stack (and only one stack at that), so I doubt that this is how higher-level systems do it (i.e. systems with multiple threads, processes, fibers, workers, coroutines, generators, etc.). That is why this question is about v8.

How does v8 (at a high level, maybe linking to some related code if it's useful) structure the stack? How does it structure the heap? And code and data. You always see diagrams like this:

[image: memory layout diagram, not preserved]

But (a) is that how a production JIT system like v8 does it? And (b) what is the actual implementation of the key structs? What data structures are used? Is the stack actually a linked list of some sort? Or is it an array? Is the heap broken down into a doubly-linked list of blocks of a certain size? Are they allocated in physically contiguous blocks of memory (in v8), or are they using OS-specific syscalls like mmap on Linux to allocate memory, which it then structures in its own way using some sort of block architecture (which is what I would like to know about)? Or is it just relying on malloc and such?

Basically, I am interested in simulating how a production JIT compiler works, and I am stuck at the very beginning: how the memory is structured (with structs and such). I am not sure whether the stack is actually a linked list in reality, or something else. I am very much interested to know.

The reason I think these diagrams are a lie is because we are usually dealing with virtual memory here; the physical memory is abstracted away. And even then, the virtual memory is divided into small pages, which might be scattered around or something; I don't know. So I feel these visuals are totally incorrect and misleading, and I would like to know what it actually looks like.


Solution

    How does v8 structure the stack?

    Just like any other program: on x64, rbp points at the base of the current function's stack frame, and rsp points at the "top" of the stack (which grows downwards, towards lower addresses). The push instruction decrements rsp and writes to where it points; pop reads from where rsp points and then increments it. There's no data structure, just raw memory.
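
    For someone simulating this, the "raw memory plus two registers" idea can be sketched in a few lines of C. This is a toy model, not V8 code: the names ToyStack, toy_push, toy_pop, toy_enter, toy_leave and toy_demo are made up for illustration, the "registers" are array indices rather than real addresses, and the stack size is arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_STACK_WORDS 64 /* arbitrary size, for illustration only */

/* Hypothetical toy model: a flat block of words plus two "registers". */
typedef struct {
    uint64_t memory[TOY_STACK_WORDS]; /* raw memory: no nodes, no links */
    size_t sp;                        /* index of the top slot, like rsp */
    size_t bp;                        /* index of the frame base, like rbp */
} ToyStack;

/* An empty stack starts at the highest index and grows downwards. */
static void toy_init(ToyStack *s) {
    s->sp = TOY_STACK_WORDS;
    s->bp = TOY_STACK_WORDS;
}

/* push: decrement the stack pointer, then write. */
static void toy_push(ToyStack *s, uint64_t value) {
    assert(s->sp > 0 && "toy stack overflow");
    s->memory[--s->sp] = value;
}

/* pop: read, then increment the stack pointer. */
static uint64_t toy_pop(ToyStack *s) {
    assert(s->sp < TOY_STACK_WORDS && "toy stack underflow");
    return s->memory[s->sp++];
}

/* Prologue: save the caller's base, start a new frame, reserve locals. */
static void toy_enter(ToyStack *s, size_t local_words) {
    toy_push(s, (uint64_t)s->bp);
    s->bp = s->sp;
    assert(s->sp >= local_words && "toy stack overflow");
    s->sp -= local_words;
}

/* Epilogue: drop the locals, restore the caller's base. */
static void toy_leave(ToyStack *s) {
    s->sp = s->bp;
    s->bp = (size_t)toy_pop(s);
}

/* Demo: returns how many words are in use inside a callee frame. */
static size_t toy_demo(void) {
    ToyStack s;
    toy_init(&s);
    toy_push(&s, 42);       /* a value the caller left on the stack */
    toy_enter(&s, 3);       /* callee frame with 3 local slots */
    s.memory[s.bp - 1] = 7; /* a local, addressed relative to bp */
    size_t used = TOY_STACK_WORDS - s.sp; /* 1 value + saved bp + 3 locals */
    toy_leave(&s);
    uint64_t callers_value = toy_pop(&s);
    assert(callers_value == 42); /* the caller's data is untouched */
    return used;
}
```

    Calling toy_demo() from a main function returns 5: one pushed value, one saved base index, and three local slots. Entering and leaving a "function" only moves two indices; nothing is allocated or linked.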

    Is the heap [...] using OS-specific syscalls like mmap on Linux to allocate memory

    Yes. V8's managed heap is allocated one page at a time using mmap, or equivalents on other operating systems. Note that what such diagrams call "heap" is heap-allocated memory in the C++ sense, not the managed heap in a VM sense. One could argue that the latter is just a special part/case of the former though, and whether you use mmap or malloc is an implementation detail that doesn't really matter (beyond possibly certain performance differences, depending on usage patterns).
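
    To make "get a chunk from mmap, then structure it yourself" concrete, here is a minimal sketch for Linux (or another POSIX system with anonymous mappings). It is not V8's layout: the ToyPage header, its fields, the 256 KiB chunk size, and the bump-pointer policy are all assumptions chosen to keep the example short.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

#define TOY_PAGE_SIZE (256 * 1024) /* assumed chunk size, illustration only */

/* Hypothetical header stored at the start of each mmap'ed chunk. */
typedef struct ToyPage {
    struct ToyPage *next; /* lets a heap keep a list of its chunks */
    uint8_t *top;         /* next free byte (bump pointer) */
    uint8_t *limit;       /* one past the last usable byte */
} ToyPage;

/* Ask the OS for one chunk of zeroed, private, anonymous memory. */
static ToyPage *toy_page_new(ToyPage *next) {
    void *mem = mmap(NULL, TOY_PAGE_SIZE, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED) return NULL;
    ToyPage *page = (ToyPage *)mem;
    page->next = next;
    page->top = (uint8_t *)mem + sizeof(ToyPage);
    page->limit = (uint8_t *)mem + TOY_PAGE_SIZE;
    return page;
}

/* Bump allocation: round the size up to 8 bytes and advance top. */
static void *toy_page_alloc(ToyPage *page, size_t size) {
    if (size > TOY_PAGE_SIZE) return NULL; /* can never fit; avoids overflow */
    size = (size + 7) & ~(size_t)7;
    if (size > (size_t)(page->limit - page->top)) return NULL; /* chunk full */
    void *result = page->top;
    page->top += size;
    return result;
}

/* Hand the whole chunk back to the OS. */
static void toy_page_free(ToyPage *page) {
    munmap(page, TOY_PAGE_SIZE);
}
```

    In this sketch, a collector would call toy_page_new whenever toy_page_alloc returns NULL and release whole chunks with toy_page_free. A malloc-based version would look almost identical, which is the sense in which the choice is an implementation detail.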

    I think these diagrams are a lie [because] we are usually dealing with virtual memory here

    I'd put it the opposite way: virtual memory is why these diagrams are (slightly simplified but) mostly accurate. Every process thinks that it is alone in the world; it doesn't see any other processes in its memory address space (just a bit of the kernel, for interacting with it). The memory that a process sees does look like raw memory; it's just that the addresses aren't physical addresses, because there is a translation layer in between (managed by the kernel and the CPU).
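
    One way to observe this per-process view on a POSIX system is to fork and let the child write to a global variable. The variable has the same virtual address in both processes, yet the parent never sees the write. The sketch below only illustrates address-space isolation; the names toy_counter and toy_parent_value_after_child_write are made up for the example.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Lives at the same virtual address in the parent and in the child. */
static int toy_counter = 1;

/* Returns the parent's value after the child overwrote "the same" address,
   or -1 if fork or waitpid failed. */
static int toy_parent_value_after_child_write(void) {
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        toy_counter = 99; /* changes only the child's private copy */
        _exit(toy_counter == 99 ? 0 : 1);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    if (!WIFEXITED(status) || WEXITSTATUS(status) != 0) return -1;
    return toy_counter; /* the parent's mapping was never touched */
}
```

    The function returns 1 in the parent even though the child stored 99 at the identical address: each process has its own translation from virtual to physical memory.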