Tags: assembly, x86, x86-16, masm, micro-optimization

Is there some benefit in the following assembly commands?


In our systems programming classes, we're being taught assembly language. In most of the sample programs our professor has shown in class, he uses:

XOR CX, CX

instead of

MOV CX, 0

or

OR AX, AX
JNE SOME_LABEL

instead of

CMP AX, 0
JNE SOME_LABEL

or

AND AL, 0FH        ; To convert input ASCII value to numeral
; The value in AL has already been checked to lie b/w '0' and '9'

instead of

SUB AL, '0'

My question is: is there any performance benefit to using AND/OR or XOR instead of the alternative (easier to understand and read) method?

Since these programs are generally shown to us during theory lecture hours, most of the class is unable to actually evaluate them verbally. Why spend 40 minutes of lecture explaining these trivial statements?


Solution

  • XOR CX, CX  ;0x31 0xC9
    

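    Uses only two bytes: opcode 0x31 and a ModR/M byte that encodes the source and destination registers (here they are the same).

    MOV CX, 0  ;0xB9 0x00 0x00
    

    Needs one more byte: opcode 0xB9 (0xB8 plus the register number of CX; this form has no ModR/M byte) followed by a two-byte immediate filled with zeroes. There is little difference from a clocking perspective (on the 486 and later both take about one clock; the original 8086 lists 3 clocks for xor reg,reg and 4 for mov reg,imm), but mov needs 3 bytes while xor needs only two. Note that xor also modifies the flags (it clears CF and OF and sets ZF), whereas mov leaves them untouched.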

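    OR AX, AX  ;0x0B 0xC0 (or the equivalent 0x09 0xC0)
    

    again uses only an opcode byte and a ModR/M byte, while

    CMP AX, 0  ;0x3D 0x00 0x00 <-- other registers: 0x83 ModRM 0x00 or 0x81 ModRM 0x00 0x00
    

    uses three or four bytes. Here it uses three bytes (opcode 0x3D plus a word immediate representing zero) because x86 has special short opcodes for some operations on the Accumulator register. For other registers the assembler uses either opcode 0x83 with a ModR/M byte and a sign-extended byte immediate (three bytes) or opcode 0x81 with a ModR/M byte and a word immediate (four bytes). Both instructions set ZF and SF from the value in AX and leave CF and OF cleared, so the JNE that follows behaves identically. It's again about the same when talking about CPU clocks.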

    There's no difference to the processor when executing

    AND AL, 0x0F  ;0x24 0x0F  <-- again special opcode for Accumulator
    

    and

    SUB AL, '0'  ;0x2C 0x30  <-- again special opcode for Accumulator
    

    (both are two bytes long). The results differ only when AL does not hold an ASCII digit: after subtracting ASCII zero you can't be sure that no value greater than 9 remains in the Accumulator (any byte value is possible), while AND always leaves a value between 0 and 15, which can still exceed 9. Also, AND sets OF and CF to zero, while SUB sets them according to the result. ANDing can be safer, but my personal opinion is that the choice depends on context.
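
The flag and result behaviour described above can be checked with a small model. The following Python sketch is not an emulator: the function names are made up for illustration, and it models only ZF, SF, CF and OF (PF and AF are ignored). It compares OR AX, AX with CMP AX, 0 for every 16-bit value, and AND AL, 0FH with SUB AL, '0' for digit and non-digit input.

```python
# Hypothetical helper names; a simplified model of the flags, not an x86 emulator.

def logic_flags(result, bits=16):
    """Flags after OR/AND/XOR: CF and OF are cleared, ZF and SF follow the result."""
    mask, sign = (1 << bits) - 1, 1 << (bits - 1)
    result &= mask
    return {"ZF": result == 0, "SF": bool(result & sign), "CF": False, "OF": False}

def sub_flags(a, b, bits=16):
    """Flags after SUB/CMP computing a - b on unsigned bit patterns."""
    mask, sign = (1 << bits) - 1, 1 << (bits - 1)
    result = (a - b) & mask
    return {
        "ZF": result == 0,
        "SF": bool(result & sign),
        "CF": a < b,                                # borrow out of the top bit
        "OF": bool((a ^ b) & (a ^ result) & sign),  # signed overflow
    }

# OR AX, AX versus CMP AX, 0: identical modelled flags for every 16-bit value.
or_matches_cmp = all(logic_flags(ax | ax) == sub_flags(ax, 0) for ax in range(0x10000))

def and_al_0fh(al):
    """AND AL, 0FH: keep only the low nibble."""
    return al & 0x0F

def sub_al_ascii0(al):
    """SUB AL, '0': subtract 30h, wrapping to 8 bits."""
    return (al - ord("0")) & 0xFF

# For validated input ('0'..'9') both conversions agree.
digits_agree = all(and_al_0fh(ord(c)) == sub_al_ascii0(ord(c)) for c in "0123456789")

print(or_matches_cmp, digits_agree)                   # True True
print(and_al_0fh(ord("A")), sub_al_ascii0(ord("A")))  # 1 17
print(and_al_0fh(ord(":")), sub_al_ascii0(ord(":")))  # 10 10
```

For non-digits the two conversions can differ: 'A' becomes 1 under AND but 17 under SUB, so only the SUB result could be caught by a later range check. ':' maps to 10 under both, so neither instruction replaces validating the input first, as the question's code already does.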