I have the following program to multiply two numbers:
.globl main
main:
# Store the two numbers temporarily in ebx, ecx
mov $7, %ebx
mov $14, %ecx
# clear out eax and add ebx (7) to it ecx (14) times
mov $0, %eax
multiply_step:
add %ebx, %eax
dec %ecx
jnz multiply_step
ret
However, if I add in variables for the 14 and 7, for whatever reason the program takes about a second to run, which seems a bit strange (the above program is instantaneous):
.globl main
.globl x,y
x: .byte 7
y: .byte 14
main:
mov x, %ebx
mov y, %ecx
mov $0, %eax
multiply_step:
add %ebx, %eax
dec %ecx
jnz multiply_step
ret
Why does this program take longer to run? I am invoking both as:
$ gcc m2.s -o m2 && ./m2; echo $?
# 98
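The variable x is a byte, but mov x, %ebx loads 4 bytes into ebx, so ebx doesn’t end up with the value 7. The load also picks up the three bytes that follow x in memory: y, then the first bytes of main's machine code, since there is no section directive and everything is assembled into the same section. The actual value loaded into ebx is 0x1d8b0e07 (x86 is little-endian, so x is the low byte). Similarly, the value in ecx is something like 0x011d8b0e, so the loop runs millions of iterations instead of 14, which is why it takes much longer than when ecx is 0x0e.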
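Despite this error, the low byte of the result is the same: the low 8 bits of a product depend only on the low 8 bits of the operands, and the exit status that echo $? prints keeps only the low byte of eax, so you still see 98.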
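To make the overlap concrete, here is a rough C model of what the 4-byte loads see. It is only a sketch: the `image` array, `load32_le`, `exit_status`, and `check_wide_loads` are hypothetical names, the 0x8b 0x1d bytes are taken from the value quoted above, and the trailing 0x01 is an assumed placeholder rather than a real byte from your binary.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed snapshot of memory starting at x: x, y, then the first bytes
   of main. The final 0x01 is a placeholder, not a real address byte. */
const uint8_t image[] = { 0x07, 0x0e, 0x8b, 0x1d, 0x01 };

/* Model of a 32-bit little-endian load such as `mov x, %ebx`. */
uint32_t load32_le(const uint8_t *p)
{
    return (uint32_t)p[0]
         | (uint32_t)p[1] << 8
         | (uint32_t)p[2] << 16
         | (uint32_t)p[3] << 24;
}

/* Model of the exit status: only the low 8 bits of eax survive. */
uint8_t exit_status(uint32_t eax)
{
    return (uint8_t)(eax & 0xff);
}

/* Checks the values discussed in the text. */
void check_wide_loads(void)
{
    uint32_t ebx = load32_le(image);      /* load starting at x */
    uint32_t ecx = load32_le(image + 1);  /* load starting at y */

    assert(ebx == 0x1d8b0e07u);
    assert(ecx == 0x011d8b0eu);
    /* Unsigned multiplication wraps mod 2^32, like the add loop does. */
    assert(exit_status(ebx * ecx) == 98);
}
```

Calling `check_wide_loads()` from a `main` should pass silently under these assumptions, which matches the 98 you see from `echo $?`.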
To load these byte values into 32-bit registers, use:
movzbl x, %ebx
movzbl y, %ecx
This instruction reads a byte from memory, zero-extends it to 32 bits, and puts the result in the destination register.
Or in 64-bit code like you've been using in your other questions, RIP-relative addressing is more efficient and will work in modern PIE executables:
movzbl x(%rip), %ebx
movzbl y(%rip), %ecx
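For comparison, the fixed version can be modeled in C as well. This is a sketch under assumptions, not a translation of your binary: `x_byte`, `y_byte`, `load8_zero_extend`, and `multiply_by_adding` are hypothetical names standing in for x, y, movzbl, and the multiply_step loop.

```c
#include <stdint.h>

/* Stand-ins for the two .byte variables. */
const uint8_t x_byte = 7;
const uint8_t y_byte = 14;

/* Model of `movzbl x, %ebx`: read one byte, zero-extend to 32 bits. */
uint32_t load8_zero_extend(const uint8_t *p)
{
    return (uint32_t)*p;
}

/* Model of the multiply_step loop: add ebx to eax, ecx times.
   Like the assembly, it decrements before testing, so ecx == 0
   would wrap around instead of skipping the loop. */
uint32_t multiply_by_adding(uint32_t ebx, uint32_t ecx)
{
    uint32_t eax = 0;
    do {
        eax += ebx;
    } while (--ecx != 0);
    return eax;
}
```

With the zero-extended loads, `multiply_by_adding(load8_zero_extend(&x_byte), load8_zero_extend(&y_byte))` runs the loop 14 times and returns 98.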