# Exact difference between mul and mulu

According to one source, `mul` treats the contents of Rs and Rt as signed integers, whereas `mulu` treats them as unsigned integers.

But every time I execute `mul` and `mulu`, they seem to give the same result:

```
li   $t0, -2
li   $t1, 2

mul  $s0, $t0, $t1    # signed multiply
mulu $s1, $t0, $t1    # unsigned multiply
```

Both store -4 in `$s0` and `$s1`. What does it mean to say that Rs and Rt are considered signed or unsigned integers, and how do `mul` and `mulu` treat Rs and Rt differently? Is there a specific case in which `mul` and `mulu` give different results? I am using the MARS simulator. Thank you. Please ask if anything in the question is unclear.

Solution

MIPS has a multiplication unit that gives exact answers: it takes 32 bits × 32 bits → 64 bits, which is how multiplication works, mathematically speaking.

On MARS, `mul` is a real instruction. It produces a 64-bit result captured in `hi` and `lo`, and also stores the low 32 bits of that result into `rd`. (This instruction was not part of the first MIPS; it was added later.) If you execute it in MARS, you'll see `hi` and `lo` change as well as `rd`.

On MARS, by contrast, `mulu` is a pseudo-instruction. It also produces a 64-bit result captured in `hi` and `lo`, and stores the low 32 bits of that into "`rd`" (in quotes because `mulu` is not a real instruction and so has no actual register fields). The MARS assembler implements it as two real instructions: first `multu rs, rt`, then `mflo rd`. If you look at the machine code MARS generates for `mulu`, you'll see this expansion, and if you single-step it you'll first see `hi` and `lo` change and then `rd`.
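As a rough illustration of that expansion, the `mulu` line from the question should behave like the two real instructions below. This is a sketch based on the expansion described above, not a dump of MARS output; the register choices simply mirror the question.

```
    li    $t0, -2
    li    $t1, 2

    # What is written in the source (pseudo-instruction):
    #     mulu  $s1, $t0, $t1
    # What gets executed, per the expansion described above:
    multu $t0, $t1        # hi:lo = $t0 * $t1, operands treated as unsigned
    mflo  $s1             # $s1 = lo, the low 32 bits of the 64-bit product
```

Note that `hi` is computed but never copied anywhere, which is why the signed/unsigned difference is not visible in `$s1`.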

Older MIPS offered only 32 × 32 → 64 results, via the two-operand `mult` and `multu`, with the 64-bit result captured in the special `hi` and `lo` registers. (This was so that multiplication could take multiple cycles without interfering with the integer register file, which could be servicing other instructions running in parallel.) You might consider experimenting with those, e.g. `mult $t0, $t1` and `multu $t0, $t1`.

The answer to 2 × -2 done in signed (e.g. using `mult`) is `hi` = 0xffffffff, `lo` = 0xfffffffc, and this value can safely be truncated to 32 bits (or fewer), because it is simply -4, which fits in the smaller number of bits [1].

There is no -2 in unsigned, so those bits are interpreted as a large positive number: 0xfffffffe in hexadecimal, i.e. 4,294,967,294 in decimal.

The answer to 2 × 0xfffffffe done in unsigned (using `multu`) is `hi` = 0x00000001, `lo` = 0xfffffffc. This value, which is positive as you'd expect for unsigned × unsigned, does not fit in 32 bits [2], so when it is forcefully truncated without checking, as you're doing, we have overflow, which is to say: a wrong answer.

It so happens that the low 32 bits are the same for the signed and unsigned products, but this is relatively meaningless here because of the overflow. It is useful for the hardware, of course, since it means the two multiplication types can share much of their circuitry.
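To see the two instructions disagree in MARS, look at `hi` rather than only at `lo`. A minimal sketch, using the same operands as the question (the destination registers `$s0` to `$s3` are arbitrary choices for illustration):

```
.text
main:
    li    $t0, -2          # bit pattern 0xfffffffe
    li    $t1, 2

    mult  $t0, $t1         # signed: -2 * 2 = -4
    mfhi  $s0              # $s0 = 0xffffffff (sign extension of -4)
    mflo  $s1              # $s1 = 0xfffffffc

    multu $t0, $t1         # unsigned: 4294967294 * 2 = 8589934588
    mfhi  $s2              # $s2 = 0x00000001
    mflo  $s3              # $s3 = 0xfffffffc (same low word as $s1)

    li    $v0, 10          # exit
    syscall
```

After running this, `$s1` and `$s3` match while `$s0` and `$s2` differ, which is the signed/unsigned distinction that `mul` and `mulu` hide by returning only the low word.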

[1] How can we tell this 64-bit value fits in 32 bits? The upper result is either all 0's or all 1's, and those bits also match the sign bit (top bit, MSB) of the lower result.

How would we make a runtime test for this on MIPS? Take the lower 32-bit result and shift it arithmetically to the right by 31 positions (leaving only the sign bit in the LSB position). An arithmetic shift replicates the sign bit as it shifts right, so we obtain a value that is either all 0's or all 1's according to the original sign. Then compare that shifted value with the upper 32 bits: if they are equal, the 64-bit value can be represented in 32 bits; if not, it won't fit in 32 bits.
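The signed test just described could be sketched as follows. The labels `no_fit`, `done`, and `msg` are hypothetical names chosen for this example, and the code assumes MARS's default settings (delayed branching off) and its standard print/exit syscalls.

```
.text
main:
    li    $t0, -2
    li    $t1, 2

    mult  $t0, $t1         # signed 64-bit product in hi:lo
    mflo  $s0              # low 32 bits
    mfhi  $t2              # high 32 bits
    sra   $t3, $s0, 31     # all 0's or all 1's, copied from lo's sign bit
    bne   $t2, $t3, no_fit # hi is not the sign extension of lo

    li    $v0, 1           # fits: print $s0 as a signed integer
    move  $a0, $s0
    syscall
    j     done

no_fit:
    li    $v0, 4           # does not fit: print a message instead
    la    $a0, msg
    syscall

done:
    li    $v0, 10          # exit
    syscall

.data
msg: .asciiz "signed product does not fit in 32 bits\n"
```

With the operands from the question, `hi` and the shifted `lo` are both 0xffffffff, so the branch is not taken and the product is reported as fitting.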

[2] How do we know this 64-bit number won't fit in 32 bits? Since all the bits of an unsigned data type are magnitude bits (i.e. there is no sign bit), if the upper 32 bits of the result are non-zero, then the number needs more than 32 bits to represent, and keeping only the low 32 bits truncates the result as if computing it modulo 2^32.
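The unsigned check is simpler, since it only asks whether `hi` is zero. A minimal sketch, where the flag register `$s1` and the label `done` are assumptions made for this example:

```
.text
main:
    li    $t0, -2          # same bits as unsigned 4294967294 (0xfffffffe)
    li    $t1, 2

    multu $t0, $t1         # unsigned 64-bit product in hi:lo
    mflo  $s0              # low 32 bits
    mfhi  $t2              # high 32 bits

    li    $s1, 1           # flag: assume the product fits
    beq   $t2, $zero, done # hi == 0: $s0 is the exact unsigned product
    li    $s1, 0           # hi != 0: $s0 is only the product modulo 2^32

done:
    li    $v0, 10          # exit
    syscall
```

For these operands `hi` is 0x00000001, so `$s1` ends up 0, flagging the truncated result in `$s0` as an overflow.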