c x86-64 computer-science cpu-architecture cpu-registers

Why does creating a pointer to a local variable require the procedure to allocate space on the stack?

I was reading the third chapter of "Computer Systems: A Programmer’s Perspective." In the section "Local Storage on the Stack," the book says:

Most of the procedure examples we have seen so far did not require any local storage beyond what could be held in registers. At times, however, local data must be stored in memory. Common cases of this include these: The address operator ‘&’ is applied to a local variable, and hence we must be able to generate an address for it.

I don't understand the reason for this. Consider this example from the book:

long caller()
{
    long arg1 = 534;
    long arg2 = 1057;
    long sum = swap_add(&arg1, &arg2);
    long diff = arg1 - arg2;
    return sum * diff;
}

The function swap_add takes two pointer arguments, so the caller needs to allocate space on its stack for the local variables arg1 and arg2 in order to pass their addresses. I understand that a pointer cannot refer to a register, but I don't understand why.

Why can't we store arg1 and arg2 in registers and use &arg1 and &arg2 to reference them? What would be the consequence of doing so? The book focuses on x86-64, but I would love to know about other architectures as well.

Solution

Why does creating a pointer to a local variable require the procedure to allocate space on the stack?

The book is distinguishing between local variables that have storage allocated for them in main memory and those stored only in registers. This is not a distinction you can actually see in C, but there are a few C features that affect it. Chief among these is applying the unary & operator to obtain the address of a local variable.

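The reason is simple: for the & operator to obtain an object's address, that object must have an address. Only objects with storage assigned to them in memory do. On x86-64, for instance, registers are selected by fields in the instruction encoding and are not part of the memory address space, so a pointer, which holds a memory address, has nothing to point at. Or at least, that's the presumption on which the book is relying, and it is true on a wide variety of current and historical architectures.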

Bear in mind, too, that one of the main reasons for wanting the address of a local variable is to allow non-local accesses to it, for example to enable scanf() to store a value in it. But when another function has control, it can use the CPU's registers as it sees fit, with a few caveats, and for the most part it is not aware of how functions further up the call chain may be using them.
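As a minimal sketch of such a non-local access, consider the following. The helper name parse_long and its failure value of -1 are hypothetical, chosen only for illustration, and sscanf() stands in for scanf() so the input is a plain string. The callee receives nothing but an address, so the variable has to live somewhere that an address can name.

```c
#include <stdio.h>

/* Hypothetical helper (not from the book): parse a decimal number. */
long parse_long(const char *text)
{
    /* The address of value is taken below, so it needs a memory slot. */
    long value = 0;

    /* sscanf() is handed only an address. It has no way of knowing
       which register, if any, this function might prefer for value. */
    if (sscanf(text, "%ld", &value) != 1)
        return -1;  /* arbitrary failure value for this sketch */

    return value;
}
```

Calling parse_long("42") returns 42, because sscanf() wrote through the pointer into the caller's storage for value.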

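That does not mean that a function cannot hold such a variable's value in a register for a time, or even most of the time. But if the variable has storage assigned in memory, then there are points at which its value must be written to or read from that memory, such as before and after a call to a function that has been given its address.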
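Here is the question's example again, annotated with where those memory accesses would fall. The question does not show swap_add, so the definition below is an assumption: it swaps the two pointed-to values and returns their sum. The comments describe what a straightforward compilation has to do, not what any particular compiler emits.

```c
/* Assumed definition: swap *xp and *yp, return the sum of the two. */
long swap_add(long *xp, long *yp)
{
    long x = *xp;   /* read the caller's arg1 from memory */
    long y = *yp;   /* read the caller's arg2 from memory */
    *xp = y;        /* write back into the caller's stack frame */
    *yp = x;
    return x + y;
}

long caller(void)
{
    long arg1 = 534;    /* must be stored to memory before the call */
    long arg2 = 1057;   /* likewise */
    long sum = swap_add(&arg1, &arg2);
    /* swap_add may have changed both variables through the pointers,
       so their current values are reloaded from memory here. */
    long diff = arg1 - arg2;
    return sum * diff;  /* sum and diff can stay in registers throughout */
}
```

With this definition, caller() returns 1591 * 523 = 832093. An optimizing compiler that inlines swap_add may be able to prove that no address escapes and keep everything in registers after all, which is why the stack allocation is a property of the generated code and not something C itself lets you observe.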