What happens exactly under the hood to Assembly `push`, `pop`, `call`, and `ret` operations?

I am trying to understand in detail how to implement a robust stack/register machine (sort of a hybrid I guess): How to simulate a call stack in JavaScript using only a single array. The answer there shows this:

const memory = [
  8, // initial program counter
  0,
  0,
  0,
  0,
  0,
  0,
  0,
  "push",
  1,
  "push",
  2,
  "add",
  "push",
  3,
  "add",
  "print",
  "exit",
  0,
  0,
]

In other resources I've seen they all only go as low-level as calling push, pop, call, or ret. They don't show how those are implemented. Here is another example:

But if I were to write a simulator for the entire computer, including the push, pop, call, and ret instructions, what do those look like under the hood? How do they know which place in memory is free to store the next push? What happens exactly to the memory when you call pop? How do the things on the stack actually get passed to the function being called with call? Etc.. Basically, how do these things work under the hood. I understand that they are implemented in the hardware, but if you were to implement these in code (using only a memory array, like the above question), what would they look like?

Solution

The stack pointer is a CPU register (here %esp) that must be initialized to refer to the area of memory to be used as the stack. This area needs to be mutable, and sufficiently large for the program to make calls passing parameters.

This stack pointer register is then the reference for push/pop operations:

Push operations decrement the stack pointer and write the value-to-push at that address, and,
Pop operations read the value where the stack pointer refers, and then increment the stack pointer.

The stack pointer (among other registers) is initialized before the process calls main. This is either done by the operating system, or by __start, in crt0.o that calls main.

A proper simulator will have to simulate some of the operating system behavior and/or __start behavior, e.g. putting a valid initial value into the stack pointer register, before calling main.

How do they know which place in memory is free to store the next push?

The stack pointer is the reference: its value indicates the address of the first item on the stack.

What happens exactly to the memory when you call pop?

Nothing happens to the memory, just that the stack pointer is updated (the register changes in value so that where it points is moved) so that the memory is now taken as being available (e.g. for another push). Memory below the stack pointer is considered free stack area, and memory at and above the stack pointer is considered in use stack area.

How do the things on the stack actually get passed to the function being called with call?

Both the calling function (the caller) and called function (the callee) have access to the CPU stack register.

The callee knows that there is a return address and parameters on the stack. The return address is at the top of the stack — it is what the stack pointer points directly at. Thus stack pointer's address + 4 then points to parameter #1, +8 to parameter #2, etc..

A simulator simulating a CPU would simulate both memory and CPU registers, such as a stack pointer, and program counter (aka instruction pointer). The instruction pointer holds the address of the next instruction to execute, and it is how the processor manages flow of control.

To simulate a push instruction, the simulator will decrement the simulated stack pointer register, then write the value-to-push at the address held in that stack pointer, and, also increment the program counter register in preparation for execution of the next sequential instruction following the push

The simulator does the reverse for pop: it reads from the simulated memory at location referred to by the simulated stack pointer register, followed by incrementing that simulated register. It will also increment the program counter in preparation for executing the next sequential instruction following the pop.

The call instruction has the behavior of a push, where the value pushed is the simulated program counter — modified so that it refers to the instruction immediately following the call instruction — which is the return address: the location to resume the caller when the callee is completed. Unlike the push and pop instructions, the call instruction also changes the simulated program counter register so that the next instruction to execute is the first of the called function.

The ret instruction has the behavior of a pop, but pops into the simulated program counter, so this this changes the flow of control such that the next instruction to execute is back in the caller — one instruction past the original call.

Hopefully, you can see how push and pop are duals, as is call and ret.

One last thing, if you want to allocate stack memory without specifying values to push, we can subtract from the stack pointer — and to release stack memory without popping, we can add to the stack pointer. This addl instruction following call plus does this — it pops both parameters originally pushed reclaiming two pushes (worth), without retrieving their values.