assembly cpu-architecture memory-address addressing-mode risc

RISC access address greater than largest integer register


Let's say you are running a 32-bit RISC system. What instructions would you use to access a 64-bit memory address?

In a CISC instruction set, you can simply pass the extra word using a multiword instruction. For example:

1a) JMP
1b) loAddress
1c) hiAddress

Given that RISC instructions are only one word each, how would you access a multi-word address?

Assume the ALU is 32-bit and has a carry flag.

Also, in a CISC system (for example the 8080) both the loAddress and hiAddress words would be stored in the program memory. I.e. the JMP instruction knows to look at the next item in program memory to retrieve the loAddress, and the item after that to retrieve the hiAddress. What happens in RISC?


Solution

  • Even on a CISC, what you describe is quite unusual. That's not because of being CISC; it's because of using addresses wider than registers, which is usually only found in 8-bit CPUs. (x86 segmentation qualifies too: indirect far jumps take a pointer to an m16:32 segment:offset pair, or m16:16 in 16-bit mode. Being little-endian, the offset comes first. Outside 64-bit mode, jmp ptr16:32 is also encodeable, with the absolute segment:offset as part of the instruction stream.)

    Normally when you want to design a CPU with larger address space, you also make the registers wider so you can deal with addresses efficiently. It's only at the very low end when you want to save transistors by using mostly 8-bit registers / ALUs, but can't limit your address space to 256 bytes, where you find this kind of design.


    There is a real issue here even when the address size matches the word size. Constructing arbitrary 32-bit (or 64-bit) constants is a problem that different ISAs solve different ways. ARM often uses PC-relative loads from a nearby "literal pool", while others often use a lui or equivalent to set the upper 16 bits and zero the rest, then ori with a 16-bit immediate. (ARM has some neat tricks for encoding immediates with only a few bits set, by using a shifted/rotated immediate.)
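    As a rough illustration of the lui / ori approach, here is a small Python model of the two instructions. The function names (lui, ori, split_constant, build_constant) are made up for this sketch and assume MIPS-style semantics, where ori zero-extends its 16-bit immediate; other ISAs split the constant differently.

```python
MASK32 = 0xFFFFFFFF

def lui(imm16):
    """Load upper immediate: put imm16 in bits 31..16 and zero bits 15..0."""
    return (imm16 & 0xFFFF) << 16

def ori(reg, imm16):
    """OR a register value with a zero-extended 16-bit immediate."""
    return (reg | (imm16 & 0xFFFF)) & MASK32

def split_constant(value):
    """Split a 32-bit constant into the (upper, lower) 16-bit immediates."""
    return (value >> 16) & 0xFFFF, value & 0xFFFF

def build_constant(value):
    """Model the two-instruction sequence: lui reg, hi ; ori reg, reg, lo."""
    hi, lo = split_constant(value)
    return ori(lui(hi), lo)

print(hex(build_constant(0xDEADBEEF)))  # prints 0xdeadbeef
```

    The point is only that neither instruction carries more than 16 bits of immediate, yet two of them together can produce any 32-bit value.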

    In general on a RISC, if you need to jump far away you may need to construct the address in a register using multiple instructions. Then use a jump-to-register instruction.

    MIPS branch instructions are interesting: it has relative branches that add a signed displacement to the program counter with a fairly large range, and absolute jump instructions that replace the low 28 bits of PC with a new address (constructed from a 26-bit immediate left-shifted by 2, because MIPS requires instructions to be aligned so the low 2 bits don't need to be stored). See "How to Calculate Jump Target Address and Branch Target Address?". But when the target isn't reachable from the current location with those, you need jr with an address in a register.
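    The two MIPS target calculations can be sketched in Python as follows. This assumes classic 32-bit MIPS with branch delay slots, where both calculations are based on the address of the instruction after the branch (pc + 4); the helper names are invented for the example.

```python
MASK32 = 0xFFFFFFFF

def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)

def branch_target(pc, imm16):
    """Relative branch: (pc + 4) plus the sign-extended immediate times 4."""
    return (pc + 4 + (sign_extend(imm16, 16) << 2)) & MASK32

def jump_target(pc, imm26):
    """Absolute j/jal: keep the top 4 bits of pc + 4, replace the low 28."""
    return ((pc + 4) & 0xF0000000) | ((imm26 & 0x03FFFFFF) << 2)

def reachable_with_j(pc, target):
    """True if target is word-aligned and in the same 256 MiB region as pc + 4."""
    return target % 4 == 0 and ((pc + 4) ^ target) & 0xF0000000 == 0

print(hex(jump_target(0x00400000, 0x100000)))          # prints 0x400000
print(reachable_with_j(0x00400000, 0x80000000))        # prints False
```

    When reachable_with_j is false, the code has to build the full address in a register and use jr instead.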

    x86-64 also lacks a 64-bit relative jump instruction. If you need to jump farther than ±2GiB away (not far as in a new CS segment), you need an indirect jump. Normal jump/branch instructions still use rel8 or rel32 displacements, keeping the machine code compact. The only instruction that can take a 64-bit immediate is mov-to-register. The normal code model assumes that all code within the same library or executable is within 2GiB of each other, so the linker will be able to fill in 32-bit displacements.
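    A minimal sketch of the range check a linker or JIT effectively performs before using jmp rel32. The function name and the insn_length parameter are hypothetical; the default of 5 bytes matches the E9 rel32 encoding, and the displacement is measured from the end of the jump instruction.

```python
def fits_rel32(jmp_address, target, insn_length=5):
    """Return True if target is reachable with a signed 32-bit displacement.

    The displacement is relative to the address of the next instruction,
    i.e. jmp_address + insn_length.
    """
    disp = target - (jmp_address + insn_length)
    return -(1 << 31) <= disp <= (1 << 31) - 1

print(fits_rel32(0x400000, 0x401000))        # prints True
print(fits_rel32(0x400000, 0x7FFF00000000))  # prints False
```

    If the check fails, the usual fallback is to load the 64-bit target with mov into a register and jump through that register.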


    8-bit RISC

    The only RISC ISA I'm aware of with a program counter wider than registers is AVR, a microcontroller with 8-bit registers. It can treat pairs of registers as 16-bit addresses, and its PC is 16-bit. Its IJMP (indirect jump) instruction sets PC = Z (where Z is a pair of 8-bit registers). On AVRs with 22-bit program counters instead of just 16, it zeros PC(21:16).

    EIJMP (extended indirect jump) takes the EIND register from I/O space for the high bits of PC, with the low bits still coming from Z.
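    How the PC gets assembled from 8-bit pieces can be sketched like this. The register names r31 (high byte of Z) and r30 (low byte) are not mentioned above but are the standard AVR names for the Z pair; the Python function names are invented, and the 6-bit mask on EIND assumes a 22-bit PC.

```python
def z_pointer(r31, r30):
    """Combine the Z register pair: r31 is the high byte, r30 the low byte."""
    return ((r31 & 0xFF) << 8) | (r30 & 0xFF)

def ijmp_target(r31, r30):
    """IJMP: PC(15:0) = Z, and PC(21:16) = 0 on devices with a 22-bit PC."""
    return z_pointer(r31, r30)

def eijmp_target(eind, r31, r30):
    """EIJMP: PC(21:16) = EIND, PC(15:0) = Z."""
    return ((eind & 0x3F) << 16) | z_pointer(r31, r30)

print(hex(ijmp_target(0x12, 0x34)))         # prints 0x1234
print(hex(eijmp_target(0x01, 0x12, 0x34)))  # prints 0x11234
```

    Note that the AVR PC counts 16-bit instruction words rather than bytes, so these values are word addresses.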

    AVR instructions are almost all 2 bytes long, but some versions have a 4-byte jmp instruction which takes a 0..4M absolute address for the jump target.


    Mainstream RISC machines with 32-bit registers also have 32-bit program counters and virtual address spaces. (Having more than 4GiB of physical memory could be possible, but you couldn't map it all at the same time in one process.)

    Most of them are heavily word-oriented in their design, so all they need is jr reg (MIPS) or an equivalent to branch to any possible address, because it fits in one register. This is part of the reduced complexity that RISC literally stands for.


    On a normal RISC like MIPS, SPARC, or PowerPC, 64-bit addresses are only available in the 64-bit ISA extension, where you have 64-bit integer registers. So you'd use instructions like MIPS ld $2, 0($3) to do a 64-bit (doubleword) load using $3 as the 64-bit base address. See this MIPS-IV ISA manual. (MIPS-III added 64-bit extensions, with instructions like ld and daddu. Apparently MIPS-I left a lot of its opcode coding space unused, so there was plenty of room for new opcodes to do full 64-bit ALU operations.)

    Some 32-bit CPUs added extensions to support large physical addresses without increasing the virtual address space. For example, x86's PAE defined a new page-table format with 36-bit physical addresses. But even with segmentation, a single process can't address more than 4GiB of virtual memory at a time. (x86 segment base+offset happens before virt->phys translation, creating a 32-bit linear address. So it's still useful for thread-local storage, e.g. with [fs:0] being a different linear address depending on that thread's fs segment base.)
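    A simplified sketch of why segmentation doesn't widen the virtual address space: base + offset is truncated to a 32-bit linear address before paging ever sees it. The function name and the FS base values below are made up for illustration, and segment limit and permission checks are ignored.

```python
LINEAR_MASK = 0xFFFFFFFF  # linear addresses are 32 bits wide

def linear_address(segment_base, offset):
    """Segment base + offset, truncated to a 32-bit linear address."""
    return (segment_base + offset) & LINEAR_MASK

# Hypothetical per-thread FS bases: the same [fs:0] operand
# resolves to a different linear address in each thread.
thread_a = linear_address(0x7F001000, 0)
thread_b = linear_address(0x7F002000, 0)

# Address-space sizes implied by the widths in the text.
VIRTUAL_BYTES = 1 << 32       # 4 GiB of linear (virtual) address space
PAE_PHYSICAL_BYTES = 1 << 36  # 64 GiB with 36-bit physical addresses

print(hex(linear_address(0xFFFFF000, 0x2000)))  # prints 0x1000 (wraps at 4 GiB)
print(PAE_PHYSICAL_BYTES // VIRTUAL_BYTES)      # prints 16
```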


    Extended addressing on 32-bit RISC ISAs

    Paul Clayton comments:

    PA-RISC had "space registers" which provided extended addressing. 32-bit PowerPC had segment registers which were selected based on the most significant 4 bits of the effective address from a 16-entry table (providing a 52-bit virtual address space). For PA-RISC "SRs 5 through 7 can be modified only by code executing at the most privileged level." For PowerPC, any segment register change required privilege.
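    To make the PowerPC arithmetic in that comment concrete, here is a rough sketch. It assumes each of the 16 segment registers supplies a 24-bit virtual segment ID (the comment does not spell this out), which is combined with the remaining 28 bits of the effective address to give 24 + 28 = 52 bits. The function name and the example table contents are hypothetical.

```python
def ppc32_virtual_address(effective_address, segment_vsids):
    """Map a 32-bit effective address to a 52-bit virtual address.

    segment_vsids is a 16-entry table standing in for the segment registers.
    """
    sr_index = (effective_address >> 28) & 0xF   # top 4 bits select the entry
    vsid = segment_vsids[sr_index] & 0xFFFFFF    # assumed 24-bit segment ID
    offset = effective_address & 0x0FFFFFFF      # low 28 bits pass through
    return (vsid << 28) | offset

vsids = [0] * 16
vsids[3] = 0xABCDEF  # made-up segment ID for segment register 3

print(hex(ppc32_virtual_address(0x30001234, vsids)))  # prints 0xabcdef0001234
```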

    So apparently some RISC ISAs did extend their addressing before going fully 64-bit. But I don't know the details and am not planning to take the time to research this. Other answers welcome!