Search code examples
assemblycpu-architecturetwos-complementinteger-arithmetic

How is overflow detected when doing binary subtraction


Assume we have 3 bits to play with. I am going to represent plus and minus 3 in 2's complement:

+3 = 0b011
-3 = 0b101

When doing addition you always have a dangling bit when an overflow happens like this (-3) + (+3):

  1 0 1
+ 0 1 1
  -----
1 0 0 0

But what about subtraction (-3) - (+3)?

  1 0 1
- 0 1 1
  -----
  0 1 0

0b010 is 2 which is not the correct result we expected being -6. There is an overflow and the extra bit is not even generated so how will it get detected that there is an overflow?

I guess a correct way to tackle this is to sign extend the inputs first?


Solution

  • How signed overflow is detected in logic

    (-3) - (+3) = (-3) + (-3) = (-3) + (~3) + 1
    

    Using elementary school math and twos complement

        1
      101
    + 100
    ======
    

    finish it

     1011
      101
    + 100
    ======
      010
    

    SIGNED overflow happens when the carry in and carry out of the msbit are different. Which we see here. Another way to say that is, using the inverted operand B, if the msbits of the operands are the same and the result is not the same value then it is a signed overflow.

    Note that the carry out of the msbit is a not borrow, so a 1 means there was no borrow and a 0 means there was. Some processor architectures invert the carry out and call it a borrow bit, others do not. Either solution works you just have to have your comparison logic match as well as subtract with borrow.

    With grade school math we could just keep adding columns. If we could simply add one more bit to a register we could get complete answers for addition and subtraction. For multiplication we need twice as many bits. For 3 bit operands we need a 6 bit result to not overflow. To complement that you want a 6 bit numerator for a three bit divisor for division, ideally. Not all instruction sets provide this. And yes for signed operations you need to sign extend and for unsigned you need to zero pad, thus the difference between a signed multiply and an unsigned multiply. for 3 bit results (multiply) the signed/unsigned does not matter (do it on paper and pencil like grade school and this should be obvious). Likewise with 3 bit addition or subtraction there is no need to know unsigned vs signed.

    With addition and subtraction though the normal solution is add with carry and subtract with borrow. -3 - +3 gives -6 which we cannot represent with 3 bits, we need 4 but if we assume 3 bit registers then the only thing we can do is 6. Which means we are really trying to do 111101 - 000011.

         1011
       111101
      +111100
        ======
          010
    

    and if we visually do this and cut it in half

         1   011
       111   101
      +111   100
      ============
             010
    

    we can complete the subtraction using a second instruction (subtract with borrow)

      1111 
       111 
      +111 
      =====
       111
       
    

    111010 is -6, no overflow. The key is to get the carry out of the lower half into the carry in of the upper half.

    If the architecture inverts the carry bit on the way out of the subtract then it needs to invert the borrow bit on the way into a subtract with borrow, otherwise if it does not invert it out of sub then don't invert it into sbb. Add and adc (add with carry) work the same way, just nothing is inverted, the carry bit comes into adc as the carry in of the msbit.

    Multiply is not as simple it is not a single bit thing you need a whole registers worth of bits, so some architectures will give you that 3 bits times 3 bits equals 6 bits signed and unsigned multiply instructions. But for addition and subtraction there is no real need, use the carry bit and an additional instruction with the memory/registers that contain the rest of the bits. You can add/subtract as wide as you have memory/registers, 1000 bit plus 1000 bit is easy it is just one add and a bunch of adcs in a row.