# Exact difference between mul and mulu

I read in one source that `mul` treats the contents of Rs and Rt as signed integers, whereas `mulu` treats them as unsigned integers.

But every time I execute mul and mulu, they seem to give the same result.

```
li $t0, -2
li $t1, 2
mul $s0, $t0, $t1
mulu $s1, $t0, $t1
```

Both instructions store -4, in $s0 and $s1 respectively. What does it mean to say that Rs and Rt are considered signed or unsigned integers, and how do mul and mulu treat Rs and Rt differently? Is there a specific case in which mul and mulu give different results? I am using the MARS simulator. Thank you. Please ask if anything in the question is unclear.

Solution

MIPS has a multiplication unit that gives exact answers: it takes 32 bits × 32 bits → 64 bits, which is how multiplication works, mathematically speaking.

On MARS, `mul` is a real instruction. It produces a 64-bit result captured in `hi` and `lo`, and also stores the low 32 bits of that result into `rd`. (This instruction was not part of the first MIPS; it was added later.) If you execute this instruction in MARS, you'll see `hi` and `lo` affected as well as `rd`.

Whereas on MARS `mulu` is a pseudo-instruction. It also produces a 64-bit result captured in `hi` and `lo`, and stores the low 32 bits of that into "`rd`" (in quotes because this `mulu` is not a real instruction and so has no actual register fields). The MARS assembler implements it as two real instructions: first `multu rs,rt`, then `mflo rd`. If you look at the machine code for this instruction in MARS, you'll see this expansion into two instructions, and if you execute it you'll first see `hi` and `lo` affected, and then `rd`.
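As a rough sketch of that expansion, assuming the registers from the question (`$t0`, `$t1` as sources and `$s1` as destination), `mulu $s1, $t0, $t1` would correspond to the following pair. The exact listing MARS shows may differ in formatting.

```
# Expansion of the pseudo-instruction `mulu $s1, $t0, $t1` as described above
multu $t0, $t1      # hi:lo = $t0 * $t1, both operands treated as unsigned
mflo  $s1           # $s1 = low 32 bits of the 64-bit product
```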

Older MIPS offered only 32 × 32 → 64 results, via the two-operand `mult` and `multu`, with the 64-bit result captured in the special `hi` and `lo` registers. (This was so that multiplication could take multiple cycles without interfering with the integer register file, which could be servicing other instructions running in parallel.) You might consider experimenting with those, e.g. `mult $t0, $t1` and `multu $t0, $t1`.

The answer to 2 × -2 done in signed (e.g. using `mult`) is 0xffffffff 0xfffffffc (`hi` then `lo`), and this value can safely be truncated to 32 bits (or fewer), because it is simply -4, which fits in the smaller number of bits^{1}.

There is no -2 in unsigned, so those bits are interpreted as a large positive number: 0xfffffffe, i.e. 4,294,967,294 in decimal.

The answer to 2 × 0xfffffffe done in unsigned (using `multu`) is 0x00000001 0xfffffffc (`hi` then `lo`). This value, which is positive as you'd expect for unsigned × unsigned, does not fit in 32 bits^{2}, so when it is forcefully truncated without checking, as you're doing, we have overflow, which is to say: a wrong answer.

It happens that the bit pattern of the low 32 bits is the same for signed and unsigned, but this is relatively meaningless because of the overflow. Of course, it is useful for the hardware, since it means the two multiplication types can share much of their circuitry.
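To see the difference directly, a small MARS program along these lines can be stepped through while watching the register pane. The destination registers `$s0` to `$s3` are arbitrary choices for illustration; the values in the comments are the ones worked out above.

```
        .text
main:
        li    $t0, -2            # bit pattern 0xfffffffe
        li    $t1, 2

        mult  $t0, $t1           # signed: hi:lo = -2 * 2
        mfhi  $s0                # $s0 = 0xffffffff
        mflo  $s1                # $s1 = 0xfffffffc

        multu $t0, $t1           # unsigned: hi:lo = 4294967294 * 2
        mfhi  $s2                # $s2 = 0x00000001
        mflo  $s3                # $s3 = 0xfffffffc

        li    $v0, 10            # MARS syscall 10: exit
        syscall
```

The two `lo` values match, which is why `mul` and `mulu` appear identical; only the `hi` values differ.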

^{1} How can we tell this 64-bit value fits in 32 bits? The upper result is either all 0's or all 1's and those bits also match the sign bit (top bit, MSB) of the lower result.

How would we make a runtime test for this on MIPS? Take the lower 32-bit result and shift it arithmetically to the right by 31 positions (leaving only the sign bit in the LSB position). An arithmetic shift replicates the sign bit as it shifts right, so we obtain a value that is either all 0's or all 1's according to the original sign. Then compare that shifted value with the upper 32 bits: if they are equal, the 64-bit value can be represented in 32 bits, and if they are not equal, it won't fit in 32 bits.
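A minimal sketch of that signed test for MARS follows. The temporaries `$t2` to `$t4`, the flag register `$s0`, and the label `done` are illustrative choices, not anything prescribed by MIPS.

```
        .text
main:
        li    $t0, -2
        li    $t1, 2
        mult  $t0, $t1           # signed 64-bit product in hi:lo
        mfhi  $t2                # upper 32 bits
        mflo  $t3                # lower 32 bits
        sra   $t4, $t3, 31       # sign bit of lo replicated: 0x00000000 or 0xffffffff
        li    $s0, 1             # assume the product fits in 32 bits
        beq   $t2, $t4, done     # hi matches the sign extension of lo: it fits
        li    $s0, 0             # hi differs: the product needs more than 32 bits
done:
        li    $v0, 10            # MARS syscall 10: exit
        syscall
```

For -2 × 2 both `hi` and the shifted value are 0xffffffff, so `$s0` ends up as 1.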

^{2} How do we know this 64-bit number won't fit in 32 bits? All the bits of an unsigned data type are magnitude bits (i.e. there is no sign bit), so if the upper 32 bits of the result are non-zero, the number needs more than 32 bits to represent, and keeping only the low 32 bits truncates the result as if doing modulo 2^{32}.
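The unsigned check is simpler and might look like this in MARS. Again the register choices are only for illustration; `sltiu` against 1 is one way to turn "`hi` is zero" into a 0/1 flag.

```
        .text
main:
        li    $t0, -2            # 0xfffffffe, i.e. 4294967294 when read as unsigned
        li    $t1, 2
        multu $t0, $t1           # unsigned 64-bit product in hi:lo
        mfhi  $t2                # upper 32 bits
        mflo  $t3                # lower 32 bits (the truncated result)
        sltiu $s0, $t2, 1        # $s0 = 1 if hi == 0 (fits in 32 bits), else 0
        li    $v0, 10            # MARS syscall 10: exit
        syscall
```

Here `hi` is 0x00000001, so `$s0` ends up as 0, flagging that the truncated value in `$t3` has overflowed.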
