Is there a ruleset on when forwarding in MIPS takes place?

I am currently reading two books, Computer Organization and Design by Patterson and Computer Architecture: A Quantitative Approach by Hennessy, and was wondering about forwarding.

The books assume that the result is always forwarded to the ALU input that needs it. Under what circumstances do we forward to other stages in a 5-stage in-order MIPS pipeline that has forwarding paths? Or do we always forward as soon as we can?

I looked at several slide decks from other universities and at some YouTube lectures from different professors across the globe, but this specific topic wasn't covered.

Solution

Is there a ruleset on when forwarding in MIPS takes place?

Yes. For the purposes of coursework, one way to tell is to ask whether the pipelined processor would otherwise give a different result than a single-cycle processor, which does not overlap the execution of instructions.

The designers could use stalls instead of forwards, but that would frequently negate the advantages of pipelining, so we should assume that stalls are used only when strictly necessary.

The main thing to know is that in a classic 5-stage pipeline, a RAW hazard mitigated by forwarding has a dependency distance (between the writer and the reader) of one or two instructions: the two are either back-to-back or separated by at most one instruction. This is because when there are two instructions in between the writer and the reader, the WB stage of the writer occurs in the same cycle as the ID stage of the reader, and the reader sees the written value (assuming, as in the textbook design, that the register file is written in the first half of the cycle and read in the second half). With fewer than two instructions in between, the reader's ID stage occurs earlier than the writer's WB stage, so the reader would see a stale value.

(See this for a discussion on that.)
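The distance rule above can be checked with a little cycle arithmetic. This is only a minimal sketch, assuming no stalls, one instruction fetched per cycle, and a register file that is written before it is read within a cycle; the names `stage_cycle` and `reads_stale_value` are made up for illustration.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def stage_cycle(fetch_cycle, stage):
    """Cycle in which an instruction fetched in `fetch_cycle` occupies `stage`
    (assumes no stalls)."""
    return fetch_cycle + STAGES.index(stage)

def reads_stale_value(distance):
    """True if a reader `distance` instructions after the writer reads the
    register file before the writer's WB has updated it.

    distance 1 = back-to-back, distance 2 = one instruction in between, etc.
    Assumption: WB and ID in the same cycle is safe (write-before-read).
    """
    writer_wb = stage_cycle(1, "WB")
    reader_id = stage_cycle(1 + distance, "ID")
    return reader_id < writer_wb

for distance in range(1, 5):
    print(distance, reads_stale_value(distance))
```

Only distances 1 and 2 report a stale read, which is why those are the cases that need forwarding.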

There's EX to EX for back-to-back ALU operations:

add $t2, $t0, $t1
add $t4, $t2, $t3

Here, $t2 is involved in the classic RAW hazard between these two instructions.

Most instructional coursework doesn't go further into RAW hazards than this one.

There's also forwarding from MEM to EX for a load followed by an ALU use:

lw $t2, 0($t1)
add $t4, $t2, $t3

Here, $t2 is involved in a load/use, which requires both a stall cycle and a forward.
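To see why the load/use case needs both a stall and a forward, it can help to lay the two instructions out cycle by cycle. The following is a rough sketch, not a pipeline simulator: the stall count is supplied by hand, a stalled instruction is modeled as simply repeating ID, and the helper names (`diagram`, `cycle_of`, `render`) are hypothetical.

```python
def diagram(program):
    """program: list of (text, stall_cycles) pairs in program order.
    Returns rows of (text, fetch_cycle, timeline)."""
    rows = []
    fetch = 1  # cycle in which the next instruction is fetched
    for text, stalls in program:
        # A stalled instruction repeats ID; its later stages shift right.
        timeline = ["IF"] + ["ID"] * (1 + stalls) + ["EX", "MEM", "WB"]
        rows.append((text, fetch, timeline))
        # Simplification: a stall also delays the following fetch.
        fetch += 1 + stalls
    return rows

def cycle_of(row, stage):
    """Cycle in which the instruction in `row` first occupies `stage`."""
    _, fetch, timeline = row
    return fetch + timeline.index(stage)

def render(rows):
    """One text line per instruction, one 4-character column per cycle."""
    lines = []
    for text, fetch, timeline in rows:
        cells = ["    "] * (fetch - 1) + [f"{s:<4}" for s in timeline]
        lines.append(f"{text:<20}" + "".join(cells).rstrip())
    return "\n".join(lines)

rows = diagram([("lw  $t2, 0($t1)", 0), ("add $t4, $t2, $t3", 1)])
print(render(rows))
```

With the one stall cycle, the lw finishes MEM in cycle 4 and the add enters EX in cycle 5, so the loaded value can be forwarded. Without the stall the add would be in EX during cycle 4, before the load data exists.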

In another example of MEM to EX, we separate by one instruction (unrelated to the hazard), so:

lw $t2, 0($t1)
add $t5, $t6, $t7
add $t4, $t2, $t3

Here, $t2 is in a load/use hazard, but only forwarding is required, not a stall cycle.

There's also forwarding from MEM to MEM for a load/store sequence:

lw $t2, 0($t1)
sw $t2, 0($t3)

Here, $t2 is involved in a hazard but since the value of $t2 isn't needed until MEM, it can be forwarded directly from lw's MEM stage to the next cycle's MEM stage, which is the sw.

NB: While lw has a source and a target register, sw has two source registers (and no target register), and the values of these two sources are needed at different times in the execution of the sw. In the above example, $t3 is needed in EX, whereas $t2 is needed one stage later, in MEM. (So this MEM to MEM forward applies only to the store value register of the sw, not to the base address register.)

There is also forwarding from WB to MEM, which applies when one unrelated instruction separates the writer and the reader of a load/store RAW hazard.
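The cases listed so far can be summarized in one small function. This is a sketch under stated assumptions rather than a description of any real design: paths are named the way this answer names them (the stage the writer occupies in the cycle just before the reader consumes the value, then the consuming stage), the register file is written before it is read within a cycle, and `classify` is a hypothetical name.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]

def classify(produced_in, needed_in, distance):
    """Return (stall_cycles, path) for a RAW dependency.

    produced_in: stage that computes the value ("EX" for ALU ops, "MEM" for lw).
    needed_in:   stage that consumes it ("EX" for ALU and address operands,
                 "MEM" for the value stored by sw).
    distance:    1 for back-to-back, 2 with one instruction in between, ...
    path is None when the register file already supplies the value.
    """
    writer_fetch = 1
    reader_fetch = writer_fetch + distance
    writer_wb = writer_fetch + STAGES.index("WB")
    reader_id = reader_fetch + STAGES.index("ID")
    if reader_id >= writer_wb:
        return 0, None  # assumes write-before-read in the register file

    # First cycle in which the produced value can be consumed.
    ready = writer_fetch + STAGES.index(produced_in) + 1
    wanted = reader_fetch + STAGES.index(needed_in)
    stalls = max(0, ready - wanted)
    use_cycle = wanted + stalls
    # Stage the writer occupies one cycle before the value is used.
    source = STAGES[use_cycle - 1 - writer_fetch]
    return stalls, f"{source} to {needed_in}"

print(classify("EX", "EX", 1))    # add then add
print(classify("MEM", "EX", 1))   # lw then add
print(classify("MEM", "EX", 2))   # lw, unrelated, add
print(classify("MEM", "MEM", 1))  # lw then sw (store value)
print(classify("MEM", "MEM", 2))  # lw, unrelated, sw (store value)
```

Under these assumptions only the back-to-back load followed by an ALU use reports a stall; every other case is covered by a forward alone.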

As @Peter points out, with an add followed by a store there is some latitude in which stages do the forwarding, so that is the designer's choice. He also notes that the hazards involving branches are complicated.