I wrote this assembly program:
.section .data
1: .asciz "Hello"
.section .text
entry:
mov $0x07C0, %ax
add $0x120, %ax
mov %ax, %ss
mov 0x100, %sp
mov $0x7C0, %ax
mov %ax, %ds
# mov $1b, %si
mov $0xE, %ah
mov $0x0, %si
mov $0x0, %bx
push %bp
mov %sp, %bp
mov %di, -20(%bp)
mov %si, -32(%bp)
movl $0x0, -4(%ebp)
.loopcond:
cmpl $127, -4(%ebp)
jge .halt
.print:
lodsb
int $0x10
add $0x1, -4(%ebp)
jmp .loopcond
.halt:
jmp .halt
The first instruction in the .loopcond section compares the variable to 127 (it acts like a for loop that iterates 127 times). This works fine and runs the code 127 times before jumping to .halt. When I increase the value to be compared, however (e.g. to 128), the code seems to jump to .halt immediately. I don't understand why this is happening. Is it something about signed integer comparison?
I looked at the objdump, once with 127 and 128:
// 127:
00000037 <.loopcond>:
37: 83 7d fc 7f cmpl $0x7f,-0x4(%ebp)
3b: 7d 09 jge 46 <.halt>
// 128:
00000037 <.loopcond>:
37: 81 7d fc 80 00 00 00 cmpl $0x80,-0x4(%ebp)
3e: 7d 09 jge 49 <.halt>
I noticed that the operand of the cmpl instruction is 4 bytes long in the 128 example, while it's only 1 byte in the 127 example. I suspect that something about that is the cause of this error.
You're telling GAS to assemble for 32-bit mode, but then running that machine code with the CPU in 16-bit mode, so things decode wrong.
An earlier guess was that your problem might be related to add $0x1, -4(%ebp), which has an ambiguous operand-size. If GAS picked byte operand-size, that might cause a problem, although if the upper bytes are zero it would just be zero-extending into the dword. The cause of your problem wasn't obvious from that, but it's weird that you're mixing 16 and 32-bit address sizes (BP and EBP).
(Update: in GAS, instructions other than mov with an ambiguous operand-size default to dword, at least in 32 or 64-bit mode. In 16-bit mode the default is word size, i.e. the non-byte opcode without a 66 operand-size prefix. Recent GAS versions warn about this, but still apply the default. For mov it's an error. Better assemblers like NASM treat any ambiguous operand-size as an error.)
Seriously, just put a number in a register and loop with dec reg / jnz like a normal person.
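A minimal sketch of that kind of loop, in GAS syntax. The labels print_sketch, halt and msg, the count of 5, and the choice of CX as the counter are illustrative assumptions, not taken from the question. It also assumes DS and the link address are set up so that msg's offset is valid, and that the BIOS call preserves AH and CX (the question's code already relies on AH surviving).

```asm
        .code16                     # assemble for 16-bit real mode
        .text
print_sketch:                       # hypothetical label
        mov     $msg, %si           # DS:SI -> first byte to print
        mov     $5, %cx             # loop counter: number of bytes in msg
        mov     $0x0E, %ah          # int 0x10 function: teletype output
        xor     %bx, %bx            # BH = page 0
1:
        lodsb                       # AL = byte at DS:SI, SI advances by 1
        int     $0x10               # print AL
        dec     %cx                 # sets ZF when the counter reaches 0
        jnz     1b                  # loop until CX == 0
halt:
        jmp     halt
msg:
        .ascii  "Hello"             # 5 bytes, matching the count above
```

Keeping the counter in a register avoids the stack-frame accesses entirely, so there is no memory operand whose size or addressing mode could be assembled one way and decoded another.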
Or use a debugger to look at memory and sort out what's going on. Your cmpl $127, -4(%ebp) does specify an operand-size so it's definitely doing a dword compare, not treating 128 as -128 with 8-bit 2's complement.
I noticed that the operand of the cmpl instruction is 4 bytes long in the 128 example, while it's only 1 byte in the 127 example. I suspect that something about that is the cause of this error.
That's not an error. Most basic x86 integer ALU instructions have an opcode for a version with a 32-bit immediate, and another with a sign-extended 8-bit immediate.
On original 8086 this saved 1 byte for instructions like cmp r/m16, imm8 vs. cmp r/m16, imm16. In 32/64-bit code, this saves 3 bytes for imm8 vs. imm32. https://www.felixcloutier.com/x86/cmp lists the forms available.
The cutoff point is of course -128 .. +127 because it's a sign-extended immediate. Your assembler always chooses the smallest encoding possible for a given asm source line, so everything is working as intended.
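As an illustration of that cutoff, here is a sketch of the same choice in 16-bit code, where the long form carries an imm16 rather than an imm32. The -4(%bp) operand is just an example; the byte sequences in the comments are the standard encodings of opcodes 83 /7 ib and 81 /7 iw with a bp+disp8 addressing mode.

```asm
        .code16
        cmpw    $127,  -4(%bp)      # 83 7e fc 7f      imm8, sign-extended to 16 bits
        cmpw    $128,  -4(%bp)      # 81 7e fc 80 00   imm16: 128 does not fit in a signed byte
        cmpw    $-128, -4(%bp)      # 83 7e fc 80      imm8 again: -128 is the lower cutoff
```

The byte 80 in the last line would mean 128 if it were zero-extended, which is why the assembler has to switch to the longer form for +128.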
If you're assembling for 32-bit mode but running in 16-bit mode, cmpl $imm32, r/m32 will break in a different way than the rest of your code.
The other instructions are all the same length regardless of mode, but run with the opposite operand-size (16 vs. 32). But the opcode for cmpl and cmpw is the same; the difference is only the operand-size (toggled to the non-default-for-the-mode value by a 66 prefix).
So when your cmpl assembled for 32-bit decodes in 16-bit mode, it only consumes a 16-bit immediate, so there are 2 bytes of immediate left over. Those bytes are 00 00, which decodes as a memory-destination add [bx+si], al (a 00 modrm byte in a 16-bit addressing mode selects [bx+si], with al as the register operand). This will clobber the flags from the cmp.
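Spelled out as source, this is a sketch of what a CPU in 16-bit mode sees in the bytes from the 128 listing (81 7d fc 80 00 00 00 7d 09). The label name is hypothetical. Note that the modrm byte 7d, which means -4(%ebp) with 32-bit addressing, means -4(%di) with 16-bit addressing.

```asm
        .code16
seen_in_16bit_mode:                 # hypothetical label
        cmpw    $0x80, -4(%di)      # 81 7d fc 80 00   only 2 immediate bytes are consumed
        add     %al, (%bx,%si)      # 00 00            the leftover immediate bytes
        # 7d 09 follows: the original jge, now testing the flags left by the add
```

With the imm8 form (83 7d fc 7f) there are no leftover bytes, so the jge still sees the flags from the compare, which would explain why the 127 version appears to work.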
Use .code16 or a command-line option to make 16-bit machine code.