assembly, x86, masm, twos-complement

Default storage format for negative integers


Are binary values stored in 2's complement form by default?
I experimented with the code mov al, -1 and saw that EAX = 000000FF.
Is this the default, or could we specify one's complement or some other format?


Solution

  • x86 hardware uses 2's complement for signed integers. This shows up in instructions like movsx (sign extension) and imul/idiv (signed multiply/divide), in how add, sub, and cmp set FLAGS (specifically OF), and in branch conditions like jle (less-or-equal). There are no one's complement math instructions, except for not (one's complement negation), as opposed to neg (2's complement negation, i.e. binary 0 - x).
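The difference between not and neg can be sketched in C. This is only a model of the two instructions on 8-bit values, not how the CPU implements them, and the function names not8 and neg8 are made up for illustration.

```c
#include <stdint.h>

/* Models x86 NOT on a byte: flips every bit (one's complement negation). */
uint8_t not8(uint8_t x)
{
    return (uint8_t)~x;
}

/* Models x86 NEG on a byte: computes 0 - x modulo 2^8
   (two's complement negation). */
uint8_t neg8(uint8_t x)
{
    return (uint8_t)(0u - x);
}
```

For any byte x, neg8(x) equals not8(x) + 1 truncated to 8 bits, which is the usual "invert and add one" rule for 2's complement.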

    See also Understanding Carry vs. Overflow conditions/flags which describes exactly how 2's complement overflow (OF) vs. carry-out (CF) work for addition.
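As a rough illustration of how CF and OF differ for addition, here is a small C model of an 8-bit add. The struct and function names are hypothetical; the real flags are produced by the ALU, and this sketch only reproduces their values.

```c
#include <stdbool.h>
#include <stdint.h>

/* Result of an 8-bit add plus the two flags discussed above. */
struct add8_result {
    uint8_t sum; /* low 8 bits of a + b */
    bool cf;     /* carry out of bit 7: the unsigned result did not fit */
    bool of;     /* signed overflow: the 2's complement result did not fit */
};

struct add8_result add8_flags(uint8_t a, uint8_t b)
{
    struct add8_result r;
    unsigned wide = (unsigned)a + (unsigned)b;

    r.sum = (uint8_t)wide;
    r.cf = wide > 0xFFu;
    /* OF is set when both inputs have the same sign bit
       and the sign bit of the result differs from it. */
    r.of = ((~(a ^ b)) & (a ^ r.sum) & 0x80u) != 0;
    return r;
}
```

With this model, 0x7F + 0x01 sets OF but not CF (127 + 1 overflows as signed), while 0xFF + 0x01 sets CF but not OF (-1 + 1 = 0 is fine as signed).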

    Assemblers always[1] use 2's complement when encoding negative numbers in the source into machine code. A mov-immediate in machine code just copies the bit-pattern into the register; all "interpretation" is already done before the CPU sees it. (The only mov that sign-extends a narrower immediate is x86-64's mov r/m64, imm32, available in 64-bit mode only.) Also note that mov al, -1 doesn't affect the upper bits of EAX. If you saw 0x000000FF, that's because EAX's upper bytes already happened to be zero.
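The effect of mov al, -1 on the full register can be modelled in C as follows. This is a sketch under the assumption that EAX is represented as a plain uint32_t; write_al is a made-up helper name, not anything an assembler or CPU provides.

```c
#include <stdint.h>

/* Models "mov al, imm8": replaces bits 0..7 of EAX and
   leaves bits 8..31 exactly as they were. */
uint32_t write_al(uint32_t eax, int8_t imm)
{
    /* (uint8_t)imm is the bit-pattern the assembler emits:
       -1 becomes 0xFF in 2's complement. */
    return (eax & 0xFFFFFF00u) | (uint8_t)imm;
}
```

Starting from EAX = 0, write_al(0, -1) gives 0x000000FF, matching what the question observed. Starting from 0x12345678, the same instruction would leave 0x123456FF.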

    Footnote 1: You could of course write an x86 assembler that was really weird and did something else. It's unlikely anyone would want to use it, though, because it would mean that add eax, -2 didn't decrease EAX's value by 2. Existing mainstream assemblers use the same number format as the hardware, and the hardware is hard-wired for 2's complement, not switchable.

    old_timer points out that some assemblers (e.g. simple ones for simple microcontrollers) might not support syntax for negative constants at all, in which case you'd always have to encode constants manually as hex: 0xFF, $FF, or whatever syntax the assembler uses.


    If you want to use 1's complement bit-patterns, encode them manually into hex. For example, to represent -2, write mov al, 0FDh (~2) instead of mov al, 0FEh (or mov al, -2).
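If you'd rather compute those hex patterns than work them out by hand, a small C helper along these lines could do it. The function name ones_complement8 is hypothetical, and the sketch assumes the value is in the range -127..127 that 8-bit 1's complement can represent.

```c
#include <stdint.h>

/* Returns the 8-bit one's complement bit-pattern for v.
   Assumes -127 <= v <= 127; no range checking is done. */
uint8_t ones_complement8(int v)
{
    if (v >= 0)
        return (uint8_t)v;           /* non-negative: same as plain binary */
    return (uint8_t)~(unsigned)(-v); /* negative: invert the bits of |v| */
}
```

For -2 this yields 0xFD, whereas the 2's complement pattern an assembler would emit for -2 is 0xFE.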

    And of course you'd have to implement 1's complement math using multiple instructions. add does binary addition, which is the same operation as 2's complement signed addition, but not the same operation as 1's complement. (That's a major reason why computers use 2's complement: +/- are the same operation as unsigned, and so is the low half of a multiply.)
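One common way to build 1's complement addition out of binary addition is the end-around carry: add normally, then add the carry-out back into the low bit. The C sketch below models that for 8-bit patterns; ones_add8 is an illustrative name, and on x86 the equivalent would presumably be an add followed by adc with 0.

```c
#include <stdint.h>

/* Adds two 8-bit one's complement bit-patterns. */
uint8_t ones_add8(uint8_t a, uint8_t b)
{
    /* Step 1: plain binary addition, keeping the carry in bit 8. */
    unsigned wide = (unsigned)a + (unsigned)b;

    /* Step 2: end-around carry, folding the carry-out back into bit 0.
       The largest possible input sum is 0x1FE, so this cannot carry again. */
    wide = (wide & 0xFFu) + (wide >> 8);

    return (uint8_t)wide;
}
```

For example, -2 (0xFD) plus 5 (0x05) gives 0x03 with this function, while a plain binary add of the same patterns gives 0x02.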

    Note that x86 machine code has some instruction forms, like add r/m32, sign_extended_imm8, which involve 2's complement sign extension during decoding: the upper 24 bits are copies of bit #7, replicating the top bit of the immediate to fill the register. Many 1's complement values are compatible with this, e.g. add eax, 0FFFFFFFDh can be encoded as an imm8, and assemblers will do that for you.
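The sign extension described above, and the check for whether a 32-bit immediate can use the short form, can be sketched in C like this. Both function names are made up for illustration; a real assembler's encoding logic handles many more cases.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sign-extends an 8-bit immediate to 32 bits:
   bits 8..31 become copies of bit 7. */
uint32_t sign_extend_imm8(uint8_t imm8)
{
    return (imm8 & 0x80u) ? (0xFFFFFF00u | imm8) : imm8;
}

/* True if a 32-bit immediate survives a round trip through imm8,
   i.e. it could use the sign-extended imm8 encoding. */
bool fits_imm8(uint32_t imm32)
{
    return sign_extend_imm8((uint8_t)imm32) == imm32;
}
```

Here fits_imm8(0xFFFFFFFD) is true because the byte 0xFD sign-extends back to the same value, while fits_imm8(0x000000FD) is false and would need a full imm32.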