Tags: cpu, cpu-architecture

The stage in which the data for the instruction is fetched from memory or cache


I cannot find any official (detailed) information about the instruction cycle or instruction pipelining in modern CPUs (especially AMD Zen+ and newer).

Consider the following instruction:

ADD MEM, REG

In which stage is the data for the [mem] operand fetched from memory? Before execution (during decoding) or in the execution stage?


Solution

  • According to "Modern Microprocessors: A 90-Minute Guide!":

    (...) dynamically decode the x86 instructions into simple, RISC-like micro-instructions, which can then be executed by a fast, RISC-style register-renaming OOO superscalar core. (...) Most x86 instructions decode into 1, 2 or 3 μops, while the more complex instructions require a larger number.

    OOO (out-of-order execution) means:

    processor executes instructions in an order governed by the availability of input data and execution units, rather than by their original order in a program. In doing so, the processor can avoid being idle while waiting for the preceding instruction to complete and can, in the meantime, process the next instructions that are able to run immediately and independently.

    source: Wikipedia

    1. The original CISC instruction is decoded into RISC-like μops.
    2. Modern CPUs are superpipelined and superscalar, which enables instruction-level parallelism: a single core can execute multiple instructions in parallel (and more than one instruction per clock cycle).
    3. Because of data dependencies, the order of execution isn't arbitrary. In the example above (add [mem], eax), the ALU part (the add μop) cannot execute until the value of [mem] has been fetched.
    4. "The memory operand is loaded at execution of a separate μop, before the add μop can execute.", just like @peter-cordes said. In other words, the data is fetched in the execution stage, not during decoding.
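The four points above can be illustrated with a toy model. This is only a sketch under stated assumptions, not a description of how Zen or any real core is built: the `Uop` class, the `decode_add_mem_reg` and `schedule` functions, the extra `inc` μop, and all latencies are hypothetical and chosen purely for illustration. Because the destination of `add [mem], eax` is memory, the sketch also models the write-back as a third store μop; real cores may fuse or split these operations differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Uop:
    name: str    # hypothetical label for the micro-op
    srcs: tuple  # values that must be ready before it can execute
    dst: str     # value it produces


def decode_add_mem_reg():
    """Toy decode of `add [mem], eax` into load / add / store uops."""
    return [
        Uop("load", ("mem",), "tmp"),       # tmp <- [mem]
        Uop("add", ("tmp", "eax"), "sum"),  # sum <- tmp + eax
        Uop("store", ("sum",), "mem_new"),  # [mem] <- sum
    ]


def schedule(uops, ready, latency):
    """Issue each uop in the first cycle where all its sources are ready.

    Returns a dict mapping uop name to its issue cycle.
    """
    ready_at = {value: 0 for value in ready}
    issue = {}
    pending = list(uops)
    cycle = 0
    while pending:
        for u in list(pending):
            if all(s in ready_at and ready_at[s] <= cycle for s in u.srcs):
                issue[u.name] = cycle
                ready_at[u.dst] = cycle + latency[u.name]
                pending.remove(u)
        cycle += 1
    return issue


# An unrelated later instruction (e.g. `inc ebx`) that does not depend on [mem].
program = decode_add_mem_reg() + [Uop("inc", ("ebx",), "ebx_new")]

# Assumed latencies in cycles; real values depend on the core and the cache level hit.
latency = {"load": 4, "add": 1, "store": 1, "inc": 1}

issue = schedule(program, ready={"mem", "eax", "ebx"}, latency=latency)
print(issue)
```

In this model the load μop issues in cycle 0, the add μop has to wait until cycle 4 for the loaded value, and the store follows in cycle 5. The independent `inc` μop issues in cycle 0 even though it comes later in program order, which is the out-of-order behaviour described in the quote above.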