
When does "caller save" become a MUST?


I'm reading H&P: Computer Architecture: A Quantitative Approach, 6th edition.

In Appendix A, page A-19, a paragraph states:

There are times when caller save must be used because of access patterns to globally visible variables in two different procedures. For example, suppose we have a procedure P1 that calls procedure P2, and both procedures manipulate the global variable x. If P1 had allocated x to a register, it must be sure to save x to a location known by P2 before the call to P2. A compiler’s ability to discover when a called procedure may access register-allocated quantities is complicated by the possibility of separate compilation.

My question is: How is that possible? Aren't global variables placed in the global data segment, each with its own memory address? How can they be allocated to registers?

I tried to research to understand this but didn't find anything useful! Thanks in advance for helping.


Solution

  • A compiler can sometimes see the internal implementation of a function called by the function it is currently compiling.  In some of those cases, it can adjust the calling convention and apply additional alias analysis, since it knows the implementation details of both caller and callee.

    However, the term "caller saves" seems out of place for describing these scenarios.  "Caller saves" is generally used to refer to a specific subset of the registers that have been designated as scratch registers, aka call-clobbered.  These scratch registers are presumed to be wiped out by a function call, so in normal function calling (when the callee's internal implementation details are unknown) they should not be used to hold data that is live across such a call.  If needed, compilers will move such data to memory or to call-preserved (callee-saves) registers.  Though they can, they don't necessarily reload it into the same register immediately after the call, so the caller-saves pattern is not always used the way the phrase implies.

    Callee-saves registers, better described as call-preserved registers, are used in this pattern: save them in the function prologue, use them at will in the function body, and restore their original (function-entry) values in the function epilogue.  Their advantage is that nothing has to be saved or restored inside the function body, only once in the prologue and epilogue.  This is particularly useful when the function body contains loops.  The pattern requires saving the original register values and restoring them to the exact same registers before returning to the caller.

    The phrase caller-saves implies a similar pattern: save a caller-saves register (say, to local stack memory), make a call, then restore that register.  However, compilers rarely use that pattern; instead they use scratch registers for data that isn't live across a call, and other storage (memory or call-preserved registers) for data that is live across the call.

    If P1 caches global x in a register (x still has its own address in the data segment; the register only holds a temporary working copy), then it needs to write the value back to that known location before a call that it makes to P2, if P2 uses x (or if it isn't known whether P2 uses x).  This applies whether P1 caches x in a callee-saves or caller-saves register (or in local stack memory for some reason).

    It seems confusing to use the well-known term "caller saves" (which designates the scratch-register subset, or a register in that subset) for the scenario they are describing.  The term caller saves, it seems to me, is more about the preservation requirements across a call for one's own values than about writing cached memory-based (global or other) variables back so they can be shared with a caller or callee.