
Address translation of an instruction of multiple bytes


Hi, my question is simple: if you have a 4-byte instruction and your operating system uses paging, is it possible that multiple address translations are needed to fetch this 4-byte instruction?

I searched the OSTEP course book but could not find clarification. Thanks in advance.


Solution

  • Traditional ISAs with fixed-width 4-byte instructions, such as MIPS, require them to be naturally aligned (aligned by 4, low 2 address bits = 0), so they can't span across any larger power-of-2 alignment boundary such as a page.
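
    The alignment argument can be checked with a little arithmetic. This is only a sketch: the 4 KiB page size and the helper names (pages_touched, aligned_fetch_never_spans) are assumptions made for illustration, not part of any ISA.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages; any power of 2 >= 4 behaves the same


def pages_touched(addr: int, length: int, page_size: int = PAGE_SIZE) -> int:
    """Number of distinct pages covered by the bytes [addr, addr + length)."""
    first_page = addr // page_size
    last_page = (addr + length - 1) // page_size
    return last_page - first_page + 1


def aligned_fetch_never_spans(page_size: int = PAGE_SIZE) -> bool:
    """Check every 4-byte-aligned address in two consecutive pages."""
    return all(
        pages_touched(addr, 4, page_size) == 1
        for addr in range(0, 2 * page_size, 4)
    )


if __name__ == "__main__":
    print(pages_touched(0xFFC, 4))      # aligned: last word of page 0
    print(pages_touched(0xFFE, 4))      # only 2-byte aligned: crosses into page 1
    print(aligned_fetch_never_spans())
```

    A 4-byte instruction at 0xFFC stays inside one page, while the same length at 0xFFE (only 2-byte aligned) would touch two.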

    But there are plenty of ISAs with variable-length instructions where one instruction can span a page boundary, so with virtual memory two different virt->phys translations are needed. (Typically CPUs fetch blocks like 16 bytes into a buffer or queue and decode out of that, but yes you'd still have two address translations as part of fetching the bytes of one instruction. Having a fetch buffer is what makes this a non-problem for the most part.)

    • x86 and other CISCs like m68k - machine code is an unaligned byte-stream. For x86, instruction lengths from 1 to 15 bytes are allowed.
    • ARM Thumb 2, RISC-V rv32c / rv64c, and similar for other RISCs that support compressed instructions (like MicroMIPS) to save i-cache footprint and fetch bandwidth: instructions are 2 or 4 bytes, aligned by 2.
    • ForwardCom (exists on paper only) - instructions are one, two, or three 32-bit words. (https://www.forwardcom.info/risc_cisc.php)
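
    To make the page-spanning case concrete, here is a toy model of a fetch that counts how many distinct virtual pages (and therefore virt->phys translations) one instruction needs. Everything in it is hypothetical: the dict-based page table, the frame numbers, and the function names. Real hardware translates per fetch block through a TLB rather than per byte.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages


def translate(page_table: dict, vaddr: int) -> int:
    """Toy virt->phys translation: look up the frame for the virtual page."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    return page_table[vpn] * PAGE_SIZE + offset


def fetch_instruction(page_table: dict, vaddr: int, length: int):
    """Return (physical address of each byte, number of distinct pages used)."""
    pages_seen = []
    paddrs = []
    for va in range(vaddr, vaddr + length):
        vpn = va // PAGE_SIZE
        if vpn not in pages_seen:
            pages_seen.append(vpn)  # a new page means one more translation
        paddrs.append(translate(page_table, va))
    return paddrs, len(pages_seen)


if __name__ == "__main__":
    # Hypothetical mapping: virtual page 0 -> frame 7, virtual page 1 -> frame 2.
    table = {0: 7, 1: 2}
    # 4-byte instruction aligned by 2 (Thumb-2 / RVC style) at the end of page 0.
    print(fetch_instruction(table, 0xFFE, 4))
    # Maximum-length (15-byte) x86 instruction starting 8 bytes before the boundary.
    print(fetch_instruction(table, 0xFF8, 15)[1])
```

    Note that the two halves of the instruction end up in physical frames that are not adjacent, which is why a single translation cannot cover both.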

    See "Do x86 instructions require their own encoding as well as all of their arguments to be present in memory at the same time?", which discusses the worst-case number of pages that must be present at once for forward progress to be possible, including an instruction spanning a page boundary.


    You might also consider cases like a MIPS lw $t0, 0($t1) that page-faults on its data access: after the OS's page-fault handler repairs the situation and returns to user-space, the instruction is re-fetched, which needs another address translation.

    So that counts as 2 fetches. If lots of tasks are running (and the OS context-switches to something else during the page fault), the code or data page might even get evicted again before this task is scheduled again and returns to user-space, so the code fetch or the data load faults again. The number of address translations is therefore theoretically unbounded for one successful execution.

    But still only 1 translation per fetch. (Or 2 for an instruction split across pages.)
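
    The fault-and-retry behaviour can be sketched as a small simulation. It is a deliberately crude model under stated assumptions: there are only two pages ("code" and "data"), the instruction itself does not span pages, and the evictions list is a made-up script of what a hypothetical OS throws out while handling each successive fault.

```python
def execute_load(present, evictions):
    """Count code-fetch translations until one load instruction completes.

    present:   pages resident at the start, a subset of {"code", "data"}
    evictions: page evicted during each successive fault (None for no eviction)
    """
    present = set(present)
    evictions = list(evictions)
    fetch_translations = 0
    while True:
        fetch_translations += 1  # every (re-)fetch translates the instruction address
        if "code" not in present:
            faulting = "code"    # instruction fetch faults
        elif "data" not in present:
            faulting = "data"    # the load's data access faults
        else:
            return fetch_translations  # instruction completes
        present.add(faulting)    # page-fault handler brings the page in
        if evictions:
            present.discard(evictions.pop(0))  # another page may be evicted meanwhile


if __name__ == "__main__":
    print(execute_load({"code", "data"}, []))        # no fault
    print(execute_load({"code"}, []))                # one data fault, then a retry
    print(execute_load({"code"}, ["code", "data"]))  # pages keep getting evicted
```

    Each additional scripted eviction adds another retry, which is the sense in which the total is unbounded, while each individual fetch attempt still performs a single translation.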