Exact difference between mul and mulu

According to one source, mul treats the contents of Rs and Rt as signed integers.

For mulu, by contrast, the contents of Rs and Rt are treated as unsigned integers.

But every time I execute mul and mulu, they seem to give the same result.

li $t0, -2
li $t1, 2

mul $s0, $t0, $t1
mulu $s1, $t0, $t1

Both instructions store -4, in $s0 and $s1 respectively. What does it mean to say that Rs and Rt are considered signed or unsigned integers, and how do mul and mulu treat Rs and Rt differently? Is there a specific case in which mul and mulu give different results? I am using the MARS simulator. Thank you. Please ask if anything in the question is unclear.

Solution

MIPS has a multiplication unit that gives exact answers: it takes 32 bits × 32 bits → 64 bits, which is how multiplication works, mathematically speaking.

On MARS, mul is a real instruction. It produces a 64-bit result captured in hi & lo, and also stores the low 32 bits of that result into rd. (This instruction was not part of the first MIPS; it was added later.) If you execute this instruction in MARS, you'll see hi and lo affected as well as rd.

On MARS, mulu is instead a pseudo-instruction. It also produces a 64-bit result captured in hi & lo and stores the low 32 bits into "rd" (in quotes because mulu is not a real instruction and so has no actual register fields). The MARS assembler implements it as two real instructions: first multu rs, rt, then mflo rd. If you look at the machine code for this instruction in MARS, you'll see this expansion into two instructions, and if you step through it you'll see hi and lo change first and then rd.

Older MIPS offered only 32 × 32 → 64 multiplication, via the two-operand mult & multu, with the 64-bit result captured in the special hi & lo registers. (This lets a multiplication take multiple cycles without tying up the integer register file, which can keep servicing other instructions running in parallel.) You might consider experimenting with those, e.g. mult $t0, $t1 and multu $t0, $t1.
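The experiment suggested above can be sketched as a small MARS program. This is only an illustration: the choice of $s0-$s3 as destinations is arbitrary, and the exit syscall assumes MARS's syscall conventions.

```mips
        .text
main:
        li    $t0, -2            # bit pattern 0xfffffffe
        li    $t1, 2

        mult  $t0, $t1           # signed: hi:lo = (-2) * 2
        mfhi  $s0                # upper 32 bits of the signed product
        mflo  $s1                # lower 32 bits of the signed product

        multu $t0, $t1           # unsigned: hi:lo = 0xfffffffe * 2
        mfhi  $s2                # upper 32 bits of the unsigned product
        mflo  $s3                # lower 32 bits of the unsigned product

        li    $v0, 10            # MARS syscall 10: exit
        syscall
```

After running it, compare $s0 with $s2 (the hi halves) and $s1 with $s3 (the lo halves) in the register pane.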

The answer to 2 × -2 done as a signed multiplication (e.g. using mult) is 0xffffffff 0xfffffffc (hi, then lo), and this value can safely be truncated to 32 bits (or fewer), because it is simply -4, which fits in the smaller number of bits¹.

There is no -2 in unsigned, so those same bits are interpreted as a large positive number: fffffffe₁₆, i.e. 4,294,967,294₁₀.

The answer to 2 × 0xfffffffe done as an unsigned multiplication (using multu) is 0x00000001 0xfffffffc (hi, then lo). This value, which is positive as you'd expect for unsigned × unsigned, does not fit in 32 bits², so when it is truncated without checking, as you're doing, we have overflow, which is to say: a wrong answer.

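It so happens that the low 32 bits of the product are the same for signed and unsigned multiplication, which is why mul and mulu leave the same value in the destination register; the two products differ only in the upper 32 bits (hi). In the unsigned case here, that matching bit pattern is relatively meaningless because of the overflow. It is useful for the hardware, though, since it means the two multiplication types can share much of their circuitry.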

¹ How can we tell this 64-bit value fits in 32 bits? The upper result is either all 0's or all 1's and those bits also match the sign bit (top bit, MSB) of the lower result.

How would we make a runtime test for this on MIPS? Take the lower 32-bit result and shift it arithmetically to the right by 31 positions. An arithmetic shift replicates the sign bit as it shifts right, so we obtain a value that is either all 0's or all 1's according to the original sign. Then compare that shifted value with the upper 32 bits: if they are equal, the 64-bit value can be represented in 32 bits; if not, it won't fit.
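A minimal sketch of that runtime test for MARS follows. The labels fits, no_fit and done, and the use of $s2 as a result flag, are hypothetical names chosen for illustration. It assumes MARS's default setting with delayed branching turned off.

```mips
        .text
main:
        li    $t0, 0x00010000    # 65536
        li    $t1, 0x00010000    # 65536; the product is 2^32

        mult  $t0, $t1           # signed 64-bit product in hi:lo
        mflo  $s0                # low 32 bits
        mfhi  $s1                # high 32 bits
        sra   $t2, $s0, 31       # sign bit of low word replicated: all 0's or all 1's
        bne   $t2, $s1, no_fit   # hi is not the sign extension of lo: does not fit

fits:
        li    $s2, 1             # hypothetical flag: 1 = product fits in signed 32 bits
        j     done
no_fit:
        li    $s2, 0             # 0 = product needs more than 32 bits
done:
        li    $v0, 10            # MARS syscall 10: exit
        syscall
```

With these operands hi is 1 and lo is 0, so the branch to no_fit is taken. Changing the operands to -2 and 2 should take the fits path instead, since hi is then all 1's and matches the replicated sign bit of lo.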

² How do we know this 64-bit number won't fit in 32 bits? All the bits of an unsigned data type are magnitude bits (i.e. there is no sign bit), so if the upper 32 bits of the result are non-zero, the number needs more than 32 bits to represent, and keeping only the low 32 bits truncates the result as if computing it modulo 2³².
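The unsigned check can be sketched the same way. Again this is only an illustration for MARS; using $s2 as an overflow flag is an assumption of the example, not something the simulator defines.

```mips
        .text
main:
        li    $t0, -2            # 0xfffffffe, i.e. 4,294,967,294 when read as unsigned
        li    $t1, 2

        multu $t0, $t1           # unsigned 64-bit product in hi:lo
        mflo  $s0                # low 32 bits (what mulu would put in rd)
        mfhi  $s1                # high 32 bits
        sltu  $s2, $zero, $s1    # $s2 = 1 if hi != 0 (truncation loses bits), else 0

        li    $v0, 10            # MARS syscall 10: exit
        syscall
```

Here hi is non-zero, so $s2 ends up as 1, signalling that the low 32 bits alone are not the true product.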