assembly x86-64 cpu-architecture cpu-registers calling-convention

What's the advantage of having nonvolatile registers in a calling convention?

I'm writing a JIT compiler and I've been surprised to discover that so many of the x86-64 registers are nonvolatile (callee-preserved) in the Win64 calling convention. It seems to me that nonvolatile registers just amount to more work in every function that could use them. This seems especially true for numeric computations where you'd want to use many registers in a leaf function, say some kind of highly optimized matrix multiplication. For example, only 6 of the 16 SSE registers (XMM0-XMM5) are volatile, so you'd have a lot of saving and restoring to do if you need more than that.

So yeah, I don't get it. What's the tradeoff here?

Solution

If registers are caller-saved, the caller has to save and reload every one that holds a live value around each function call, whether or not the callee actually touches it. But if registers are callee-saved, the callee only has to save the registers it uses, and only on paths where they are actually used (i.e. maybe not at all in an early-exit scenario). The disadvantage of this convention is that the callee doesn't have knowledge of the caller, so it might be saving registers that are dead anyway, but I guess that's seen as a smaller concern.
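
To put rough numbers on the tradeoff above, here is a toy cost model in Python. It is only a sketch: it counts save/restore pairs and nothing else, and the `Scenario` fields, function names, and example figures are all made up for illustration. It also assumes the caller re-saves its live registers around every call and that the callee can place its saves after an early exit, which a real compiler may or may not do.

```python
from dataclasses import dataclass


@dataclass
class Scenario:
    """Hypothetical workload; every field is an illustrative assumption."""
    calls: int        # times the caller invokes the callee
    slow_calls: int   # invocations that get past the callee's early exit
    caller_live: int  # registers holding values the caller needs after the call
    callee_used: int  # registers the callee touches on its slow path


def caller_saved_pairs(s: Scenario) -> int:
    # Caller-saved: the caller spills and reloads each live register
    # around every call, regardless of what the callee does.
    return s.calls * s.caller_live


def callee_saved_pairs(s: Scenario, shrink_wrapped: bool = True) -> int:
    # Callee-saved: the callee saves only the registers it uses. If the
    # saves sit after the early exit, only the slow calls pay for them.
    paying_calls = s.slow_calls if shrink_wrapped else s.calls
    return paying_calls * s.callee_used


# A loop that keeps 4 values live across a call that usually exits early.
hot_loop = Scenario(calls=1000, slow_calls=10, caller_live=4, callee_used=2)
print(caller_saved_pairs(hot_loop))    # 4000
print(callee_saved_pairs(hot_loop))    # 20

# The case from the question: a leaf that wants 10 extra registers, called
# from code with nothing live. Here the callee saves dead registers.
leaf_heavy = Scenario(calls=1000, slow_calls=1000, caller_live=0, callee_used=10)
print(caller_saved_pairs(leaf_heavy))  # 0
print(callee_saved_pairs(leaf_heavy))  # 10000
```

The second scenario is the disadvantage mentioned above: the callee can't know the caller has nothing live, so it pays for saves nobody needs. A mix of volatile and nonvolatile registers lets a compiler pick whichever kind is cheaper for each value.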