For a task at my school, I need to write a C Program which does a 16 bit addition using an assembly programm. Besides the result the EFLAGS shall also be returned.
Here is my C Program:
int add(int a, int b, unsigned short* flags);//function must be used this way
int main(void){
unsigned short* flags = NULL;
printf("%d\n",add(30000, 36000, flags);// printing just the result
return 0;
}
For the moment the Program just prints out the result and not the flags because I am unable to get them.
Here is the assembly program for which I use NASM:
global add
section .text
add:
push ebp
mov ebp,esp
mov ax,[ebp+8]
mov bx,[ebp+12]
add ax,bx
mov esp,ebp
pop ebp
ret
Now this all works smoothly. But I have no idea how to get the pointer which must be at [ebp+16]
pointing to the flag register. The professor said we will have to use the pushfd
command.
My problem just lies in the assembly code. I will modify the C Program to give out the flags after I get the solution for the flags.
Normally you'd just use a debugger to look at flags, instead of writing all the code to get them into a C variable for a debug-print. Especially since decent debuggers decode the condition flags symbolically for you, instead of or as well as showing a hex value.
You don't have to know or care which bit in FLAGS is CF and which is ZF. (This information isn't relevant for writing real programs, either. I don't have it memorized, I just know which flags are tested by different conditions like jae
or jl
. Of course, it's good to understand that FLAGS are just data that you can copy around, save/restore, or even modify if you want)
Your function args and return value are int
, which is 32-bit in the System V 32-bit x86 ABI you're using. (links to ABI docs in the x86 tag wiki). Writing a function that only looks at the low 16 bits of its input, and leaves high garbage in the high 16 bits of the output is a bug. The int
return value in the prototype tells the compiler that all 32 bits of EAX are part of the return value.
As Michael points out, you seem to be saying that your assignment requires using a 16-bit ADD. That will produce carry, overflow, and other conditions with different inputs than if you looked at the full 32 bits. (BTW, this article explains carry vs. overflow very well.)
Here's what I'd do. Note the 32-bit operand size for the ADD.
global add
section .text
add:
push ebp
mov ebp,esp ; stack frames are optional, you can address things relative to ESP
mov eax, [ebp+8] ; first arg: No need to avoid loading the full 32 bits; the next insn doesn't care about the high garbage.
add ax, [ebp+12] ; low 16 bits of second arg. Operand-size implied by AX
cwde ; sign-extend AX into EAX
mov ecx, [ebp+16] ; the pointer arg
pushf ; the simple straightforward way
pop edx
mov [ecx], dx ; Store the low 16 of what we popped. Writing word [ecx] is optional, because dx implies 16-bit operand-size
; be careful not to do a 32-bit store here, because that would write outside the caller's object.
; mov esp,ebp ; redundant: ESP is still pointing at the place we pushed EBP, since the push is balanced by an equal-size pop
pop ebp
ret
CWDE (the 16->32 form of the 8086 CBW instruction) is not to be confused with CWD (the AX -> DX:AX 8086 instruction). If you're not using AX, then MOVSX / MOVZX are a good way to do this.
The fun way: instead of using the default operand size and doing 32-bit push and pop, we can do a 16-bit pop directly into the destination memory address. That would leave the stack unbalanced, so we could either uncomment the mov esp, ebp
again, or use a 16-bit pushf (with an operand-size prefix, which according to the docs makes it only push the low 16 FLAGS, not the 32-bit EFLAGS.)
; What I'd *really* do: maximum efficiency if I had to use the 32-bit ABI with args on the stack, instead of args in registers
global add
section .text
add:
mov eax, [esp+4] ; first arg, first thing above the return address
add ax, [esp+8] ; second arg
cwde ; sign-extend AX into EAX
mov ecx, [esp+12] ; the pointer
pushfw ; push the low 16 of FLAGS
pop word [ecx] ; pop into memory pointed to by unsigned short* flags
ret
Both PUSHFW and POP WORD will assemble with an operand-size prefix. output from objdump -Mintel
, which uses slightly different syntax from NASM:
4000c0: 66 9c pushfw
4000c2: 66 8f 01 pop WORD PTR [ecx]
PUSHFW is the same as o16 PUSHF
. In NASM, o16
applies the operand-size prefix.
If you only needed the low 8 flags (not including OF), you could use LAHF to load FLAGS into AH and store that.
PUSHFing directly into the destination is not something I'd recommend. Temporarily pointing the stack at some random address is not safe in general. Programs with signal handlers will use the space below the stack asynchronously. This is why you have to reserve stack space before using it with sub esp, 32
or whatever, even if you're not going to make a function call that would overwrite it by pushing more stuff on the stack. The only exception is when you have a red-zone.
You're passing a NULL pointer, so of course your asm function segfaults. Pass the address of a local to give the function somewhere to store to.
int add(int a, int b, unsigned short* flags);
int main(void) {
unsigned short flags;
int result = add(30000, 36000, &flags);
printf("%d %#hx\n", result, flags);
return 0;
}