C Pointer to EFLAGS using NASM

For a task at my school, I need to write a C program which does a 16-bit addition using an assembly program. Besides the result, the EFLAGS shall also be returned.

Here is my C Program:

#include <stdio.h>

int add(int a, int b, unsigned short* flags); // function must be used this way

int main(void){
    unsigned short* flags = NULL;
    printf("%d\n", add(30000, 36000, flags)); // printing just the result
    return 0;
}

For the moment the Program just prints out the result and not the flags because I am unable to get them.

Here is the assembly program for which I use NASM:

global add
section .text

add:
    push ebp    
    mov ebp,esp
    mov ax,[ebp+8]
    mov bx,[ebp+12]
    add ax,bx
    mov esp,ebp
    pop ebp
    ret

Now this all works smoothly. But I have no idea how to get the pointer which must be at [ebp+16] pointing to the flag register. The professor said we will have to use the pushfd command.

My problem just lies in the assembly code. I will modify the C Program to give out the flags after I get the solution for the flags.

Solution

Normally you'd just use a debugger to look at flags, instead of writing all the code to get them into a C variable for a debug-print. Especially since decent debuggers decode the condition flags symbolically for you, instead of or as well as showing a hex value.

You don't have to know or care which bit in FLAGS is CF and which is ZF. (This information isn't relevant for writing real programs, either. I don't have it memorized; I just know which flags are tested by different conditions like jae or jl. Of course, it's good to understand that FLAGS are just data that you can copy around, save/restore, or even modify if you want.)
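If you do want to decode the stored value in C rather than in a debugger, the sketch below shows one way. The bit positions are the documented ones for the status flags in the low 16 bits of FLAGS; the macro names and the describe_flags helper are made up for this illustration and are not part of any library.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Documented bit positions of the arithmetic status flags in FLAGS. */
#define FLAG_CF 0x0001u /* bit 0:  carry */
#define FLAG_PF 0x0004u /* bit 2:  parity of the low result byte */
#define FLAG_AF 0x0010u /* bit 4:  auxiliary (BCD) carry */
#define FLAG_ZF 0x0040u /* bit 6:  zero */
#define FLAG_SF 0x0080u /* bit 7:  sign */
#define FLAG_OF 0x0800u /* bit 11: signed overflow */

/* Hypothetical helper: writes the names of the status flags that are set
   into buf, separated by spaces (for example "CF ZF").  Other bits, such as
   the always-set bit 1 or IF in bit 9, are ignored. */
void describe_flags(uint16_t flags, char *buf, size_t size)
{
    static const struct { uint16_t mask; const char *name; } table[] = {
        { FLAG_CF, "CF" }, { FLAG_PF, "PF" }, { FLAG_AF, "AF" },
        { FLAG_ZF, "ZF" }, { FLAG_SF, "SF" }, { FLAG_OF, "OF" },
    };
    size_t used = 0;

    if (size == 0)
        return;
    buf[0] = '\0';
    for (size_t i = 0; i < sizeof table / sizeof table[0]; i++) {
        if ((flags & table[i].mask) && used < size) {
            int n = snprintf(buf + used, size - used, "%s%s",
                             used ? " " : "", table[i].name);
            if (n < 0)
                break;
            used += (size_t)n;
        }
    }
}
```

The caller would pass the unsigned short that the asm function stored, plus a buffer of about 32 bytes, and print the resulting string next to the hex value.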

Your function args and return value are int, which is 32-bit in the System V 32-bit x86 ABI you're using. (links to ABI docs in the x86 tag wiki). Writing a function that only looks at the low 16 bits of its input, and leaves high garbage in the high 16 bits of the output is a bug. The int return value in the prototype tells the compiler that all 32 bits of EAX are part of the return value.

As Michael points out, you seem to be saying that your assignment requires using a 16-bit ADD. That will produce carry, overflow, and other conditions with different inputs than if you looked at the full 32 bits. (BTW, this article explains carry vs. overflow very well.)
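To see why the operand size matters, here is a minimal C sketch that emulates what a 16-bit ADD does to CF and OF. The struct and function names are invented for this example; it models only those two flags, not the real instruction's full effect on FLAGS.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type for the emulation below. */
struct add16_result {
    uint16_t sum;      /* low 16 bits of the result, as left in AX */
    bool     carry;    /* CF: unsigned result did not fit in 16 bits */
    bool     overflow; /* OF: signed result did not fit in 16 bits */
};

/* Emulates the CF and OF outcome of a 16-bit add. */
struct add16_result add16(uint16_t a, uint16_t b)
{
    struct add16_result r;
    uint32_t wide = (uint32_t)a + (uint32_t)b;

    r.sum = (uint16_t)wide;
    r.carry = wide > 0xFFFFu;
    /* Signed overflow happens when both inputs have the same sign bit
       and the sign bit of the result differs from it. */
    r.overflow = ((~(uint32_t)(a ^ b)) & (uint32_t)(a ^ r.sum) & 0x8000u) != 0;
    return r;
}
```

With the inputs from the question, add16(30000, 36000) gives a sum of 464 with carry set and overflow clear, because 66000 doesn't fit in 16 bits. A 32-bit add of the same values would set neither flag. add16(30000, 30000) shows the opposite case: no carry, but signed overflow.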

Here's what I'd do. Note the 16-bit operand size for the ADD (implied by AX), followed by sign-extension of the result to fill all 32 bits of EAX.

global add
section .text

add:
    push  ebp    
    mov   ebp,esp      ; stack frames are optional, you can address things relative to ESP

    mov   eax, [ebp+8]    ; first arg: No need to avoid loading the full 32 bits; the next insn doesn't care about the high garbage.
    add   ax,  [ebp+12]   ; low 16 bits of second arg.  Operand-size implied by AX

    cwde                  ; sign-extend AX into EAX

    mov   ecx, [ebp+16]   ; the pointer arg
    pushf                 ; the simple straightforward way: in 32-bit code NASM assembles this as PUSHFD
    pop   edx
    mov   [ecx], dx       ; Store the low 16 of what we popped.  Writing  word [ecx]  is optional, because dx implies 16-bit operand-size
                          ; be careful not to do a 32-bit store here, because that would write outside the caller's object.

    ;  mov esp,ebp   ; redundant: ESP is still pointing at the place we pushed EBP, since the push is balanced by an equal-size pop
    pop ebp
    ret

CWDE (the 16->32 form of the 8086 CBW instruction) is not to be confused with CWD (the AX -> DX:AX 8086 instruction). If you're not using AX, then MOVSX / MOVZX are a good way to do this.
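In C terms, the difference between sign-extending (CWDE, MOVSX) and zero-extending (MOVZX) the low 16 bits of a register looks roughly like this. The function names are invented for this sketch, and the arithmetic is written so that it doesn't rely on implementation-defined narrowing conversions.

```c
#include <stdint.h>

/* Roughly what CWDE or MOVSX does: take the low 16 bits of reg and
   replicate bit 15 into the upper 16 bits. */
int32_t sign_extend16(uint32_t reg)
{
    int32_t low = (int32_t)(reg & 0xFFFFu);
    return (low ^ 0x8000) - 0x8000; /* maps 0x8000..0xFFFF to -32768..-1 */
}

/* Roughly what MOVZX does: take the low 16 bits of reg and clear the
   upper 16 bits. */
uint32_t zero_extend16(uint32_t reg)
{
    return reg & 0xFFFFu;
}
```

Either way, whatever garbage was in the upper 16 bits of the register is discarded, which is what the int return value needs.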

The fun way: instead of using the default operand size and doing a 32-bit push and pop, we can do a 16-bit pop directly into the destination memory address. On its own that would leave the stack unbalanced, so we could either uncomment the mov esp, ebp again, or use a 16-bit pushf (with an operand-size prefix, which according to the docs makes it push only the low 16 bits, FLAGS, instead of the 32-bit EFLAGS).

; What I'd *really* do: maximum efficiency if I had to use the 32-bit ABI with args on the stack, instead of args in registers
global add
section .text

add:
    mov   eax, [esp+4]    ; first arg, first thing above the return address
    add   ax,  [esp+8]    ; second arg
    cwde                  ; sign-extend AX into EAX

    mov   ecx, [esp+12]   ; the pointer

    pushfw                     ; push the low 16 of FLAGS
    pop     word [ecx]         ; pop into memory pointed to by unsigned short* flags

    ret

Both PUSHFW and POP WORD will assemble with an operand-size prefix. Here is the output from objdump -Mintel, which uses slightly different syntax from NASM:

  4000c0:       66 9c                   pushfw 
  4000c2:       66 8f 01                pop    WORD PTR [ecx]

PUSHFW is the same as o16 PUSHF. In NASM, o16 applies the operand-size prefix.

If you only needed the low 8 bits of FLAGS (which don't include OF), you could use LAHF to load them into AH and store that.

PUSHFing directly into the destination (by temporarily pointing ESP at it) is not something I'd recommend. Temporarily pointing the stack at some random address is not safe in general: programs with signal handlers will use the space below the stack pointer asynchronously. This is also why you have to reserve stack space with sub esp, 32 or whatever before using it, even if you're not going to make a function call that would overwrite it by pushing more stuff on the stack. The only exception is when the ABI gives you a red zone, as the x86-64 System V ABI does; the 32-bit ABI used here has none.

Your C caller:

You're passing a NULL pointer, so of course your asm function segfaults. Pass the address of a local to give the function somewhere to store to.

#include <stdio.h>

int add(int a, int b, unsigned short* flags);

int main(void) {
    unsigned short flags;
    int result = add(30000, 36000, &flags);
    printf("%d %#hx\n", result, flags);
    return 0;
}