assembly cpu-architecture twos-complement integer-arithmetic

What are widening integer operations?

Let's examine Add and Multiply as examples. Which one can be categorized as widening?

Assume inputs are signed characters (i.e 8 bits in length) in 2's complement, unless declared otherwise.

Positive number addition

1+2 = 3

This operation doesn't seem to be widening. 1, 2 and 3 all fit in a character.

However, 250 + 6 overflows an unsigned char. So is this widening?

Same could be done with signed type, 125 + 5 overflows signed char into the sign bit. Is this widening?

Negative number addition

-2-3 = -5

This overflows a character in binary by 1 bit:

   1 1 1 1 1 1 1 0
+  1 1 1 1 1 1 0 1
------------------ 
 1 1 1 1 1 1 0 1 1

The overflow is normally discarded, however, is this considered a widening operation?

Positive number multiplication

1 * 2 = 2

Are all multiplications widening, even if the result still fits in the original data type?

The above example, 2 still fits in an 8-bit character however, if I do the math by hand in binary, extra 0s get appended to left of the result which get discarded.

Negative number multiplication

-2 * -3 = 6

For simplicity, let's assume our inputs are 4 bits. This overflows by twice its size as expected, i.e the result becomes 8 bits:

1110 * 1101 = 10110110

The upper 4 bits are discarded and the lower 4 bits equal 6 as expected.

Is this considered a widening operation?

Division

I haven't gone into detail about division but I assume it's considered a narrowing operation.

Solution

Widening or not is also a property of an assembly instruction that does a math operation. The choice of what instructions to provide is informed by the inherent properties of the math operations, though, e.g. that multiplication can produce wider results.

Mathematically, the sum or difference of two n-bit numbers needs at most n+1 bits to represent exactly. e.g. unsigned 3-bit or two's complement 4-bit 7+7 = 14, i.e. 0b1110 unsigned or 0b01110 two's complement.

Most CPUs only provide a non-widening implementation of add/sub, with the extra bit going into the carry flag if there is one. (Some ISAs like MIPS don't have FLAGS at all, so if you want carry-out you have to recover it from carry = (a+b) < a (unsigned compare)).

Mathematically, the product of two n-bit numbers takes up to 2n bits to represent exactly. e.g. signed 8-bit -128 * -128 = 16384 SCHAR_MIN squared (i.e. 0x80 * 0x80 = 0x4000, note the sign bit being 0 in the result, but the bit right below that being set). Or the unsigned 8-bit 255 * 255 = 65025 i.e. 0xff ^ 2 = 0xfe01.

Many CPUs do provide a widening form of multiplication, e.g. x86's mul r/m32 and imul r/m32 that write the product to EDX:EAX.

x86 also has non-widening imul r, r/m32 in case you don't care about the high half, e.g. implementing C semantics for unsigned * unsigned which produces an unsigned result, truncating the full result. (Note that the low half is the same regardless of interpreting the inputs as signed or unsigned, only the high half differs. That's why x86 doesn't bother to provide a similar mul r, r/m32 instruction)

Some ISAs, instead of providing one widening multiply, provide separate low-half and signed-high-half / unsigned-high-half instructions. (Running them back-to-back may let some implementations fuse them into a single widening multiply internally, but simpler implementations could just run them separately, and software may just use one or the other.)

Some ISAs may not even provide a way to get the high half at all, e.g. ARM Cortex-M0 microcontrollers only have 32b x 32b => 32b multiply, no umull. Using that as a building block for wider multiplies (like 64x64 => 128-bit) may mean breaking down the inputs into 16-bit chunks to apply the usual extended-precision techniques in terms of 16b x 16b => 32b multiplies.

Mathematically, integer division can produce a quotient at most the width of the dividend. For extended precision purposes, it's useful to support a wider dividend (see Why should EDX be 0 before using the DIV instruction?), but this capability isn't normally used by compilers. (Especially not on ISAs like x86 where 64b / 32b => 32b division will fault if the quotient doesn't fit in 32 bits.)

But yes, if an ISA has a division instruction with a wide dividend, like x86's implicit EDX:EAX for div and idiv, that's called narrowing division.