
Stack size on Linux - Limited stack size vs automatic stack expansion


  1. I remember, from first studying memory handling in my books as a teenager, the following example:
void foo()
{
    int a[1000000]; // waste of stack
    // ...
}

vs

void foo()
{
    int* ptr = malloc(1000000 * sizeof(int)); // much better, because the stack size is limited
    // ...
    free(ptr);
}

(Yes, I've heard that the compiler may sometimes optimize the second version into the first.)

I always imagined the stack as a "fixed, limited" memory space; my old books said "the stack is very small, do not waste it". The heap, on the other hand, can use as much RAM as needed (the only limit being the hardware itself).

However:

https://unix.stackexchange.com/questions/63742/what-is-automatic-stack-expansion

Dynamic expansion of the Linux stack

https://www.youtube.com/watch?v=7aONIVSXiJ8 47:05

So if I understand correctly, the stack size is not the main reason we choose the heap? OK, we still need malloc()/calloc() because we don't know the array size at compile time (and want to avoid VLAs). But apart from that reasoning, are there any other reasons to store our data on the heap?

I always imagined the heap being expanded as we malloc() more. Is this correct?

  2. I'm confused: if the stack grows dynamically, why can a stack overflow still occur?

Here's a picture about a Linux program in the virtual memory:

https://cdn-images-1.medium.com/max/1200/1*8b9-Z3FV6X9SP9We8gSC3Q.jpeg

Does stack overflow happen when the virtual memory space between the two arrows is exhausted?

  3. The YouTube video (https://www.youtube.com/watch?v=7aONIVSXiJ8) says at 47:40 that stack expansion may result in a newly allocated page that is not physically contiguous with the rest of the stack. The new page is mapped into the process's stack, so the program's virtual memory layout remains flat. Is this correct?

  4. The heap grows upward (for example, when we malloc()). But again, this implies that the heap's size is limited (by the stack above and the BSS below). How is it possible for the heap to utilize the whole RAM? The only way I can imagine this is if the low address is 0 and the high address is the highest address the RAM provides, so one process "sees the whole RAM as available through pointers". These are virtual memory addresses, and the CPU translates them into physical addresses. Do I understand this correctly?

  5. As both the stack and the heap grow dynamically as we allocate more, we don't have to increment the program break by hand with brk(), or allocate on the stack with alloca(). These are low-level interfaces handled by the kernel and the C runtime, and we don't call them directly. Is this correct?

I would be grateful if you could tell me whether I understand these correctly.


Solution

  • The stack can grow, but not indefinitely. As shown in the diagram in the first question you link to, the stack and the heap can both grow into the free memory area, but if they keep growing eventually they'll run into each other.

    It's not common to write programs that need an enormous amount of stack growth. If a program keeps growing the stack, it's usually an indicator of a bug causing infinite recursion. Limiting the stack size catches these errors. While there are some algorithms that recurse deeply, they're not common, and they can often be refactored into iterative algorithms.

    The problem with int a[1000000]; isn't that it "wastes" the stack. Most architectures have a relatively small limit to the size of a single stack frame, so you can't have such large arrays as local variables.

    But except for that, the usual reason to choose heap versus stack memory is related to how the data will be used. Variables on the stack are declared statically in the program code. If you need a variable number of objects, it's usually necessary to use the heap (C has variable-length arrays, but C++ doesn't, and you can't resize them the way you can with realloc()). Also, memory allocated on the stack goes away when the function returns, so you must use heap objects for data that needs to persist beyond a single function call.

    Don't worry about what he's talking about at 47:40 in the video. Application programs deal only with virtual memory; physical memory is completely hidden and only relevant to the internals of the kernel's virtual memory subsystem.

    The process break is used by the runtime library's implementation of malloc(). You don't normally deal with it directly in application programs.