Most if not all programs nowadays have a "stack" for objects with a size known at compile time, and a "heap" for dynamically allocated objects.
Where is this implemented in practice? Is at the CPU, OS, or program level? Does every program have to ship with some code to handle stack and heap management?
Could a new programming language come along that uses a completely different memory paradigm and still run on current OSs?
After taking an OS course and a Compilers course, I can answer the question from the perspective of a low-level (C/C++/Rust) program, on Linux and MIPS architecture:
When a program starts, it has the address of the base/start of its stack. This base address is either chosen by the OS or the program. Let's imagine program X's stack starts at 0x00. From Linux's point of view, it allocates a memory region from 0x00 to 0x10 (for example). Program X is free to read/write to this region however it wants. We then call the region from 0x00 to 0x10 the stack of program X.
Now if you look at the assembly code of any C program, you will see it manually increments and decrements the stack pointer, stores the return address before calling a function, etc. So it's very much the program that "drives" how the stack is used. It still has some help from the CPU hardware and instruction set, with things like a "stack pointer" CPU register. But overall it's the program's responsibility.
But what about the heap?
Let's imagine that later on, X wants to malloc
(heap-allocate) some memory. malloc()
is a function in glibc, which calls the Linux mmap
syscall (or sbrk
, but we'll focus on mmap
). mmap
basically says "Hello Linux kernel, could you please give me the memory between 0X50 and 0x70?" (Note that 0x50 and 0x70 are values chosen by malloc
).
Linux double-checks X is not already using that memory region. If it is not, Linux replies "Sure thing! You can now use that memory region as you wish, have fun!". Now, program X can read/write to it as much as it wants.
That's the simple version. To dig a bit deeper, note I didn't say "Linux checks no one else is using that region". Linux only checks X isn't using it. Well what if program Y is using it already? The answer is that X's 0x50 is different from Y's 0x50, and even though they are the same number, the hardware knows (via page tables) how to translate X's or Y's virtual address into the corresponding physical RAM address.
There is one more thing to add. This explanation makes it seem that Linux gives program X different parts of memory, and doesn't care which one is the stack or the heap. That is almost true. There is one main difference between the stack and the heaps, and that is that the stack can grow.
Take our values from above. If program X tries to write to value 0x71, the hardware will notice that X hasn't allocated that address yet (the page table entry is null), and tells Linux in the form of a "page fault". Linux checks again, goes "yep, that's an error" and kills the program with a segfault.
Now say that instead, the program has been running for a while and X used up all of its stack. So it tries to write to 0X11 (higher than 0x10). Just like before, the hardware page-faults, and Linux checks to see what happened. but it notices that right next to 0x11, there's this memory region from 0x00 to 0x10 that is marked as GROWSUP
. So instead of segfaulting, it extends the memory region from 0x10 to 0x20, and lets the program continue as if nothing happened.
This GROWSUP
attribute is the only difference between the stack and the heap from Linux's point of view.
Of course this is a simplified explanation. Once inaccuracy is that in MIPS (and most other architectures) the stack grows downwards, not upwards. But I found that hard to wrap my head around while familiarizing myself with these concepts.
Another thing to note is that Linux doesn't actually allocate anything when calling mmap
. But to find out how it pulls off that magic trick, see this excellent post.