assembly, x86, calling-convention

How to preserve the register I touch?


The assignment is: "Create an HLA Assembly language program that prompts for three integers from the user. Create and call a function that returns in DX the value of the parameter that is the smallest of the three. In order to receive full credit, after returning back to the caller, your function should not change the value of any register other than DX."

My professor said "My Code Is Not Preserving The Registers It Touches." I do not know how to do it. Could anyone help? Is it about push and pop?

program Smallest_Number;
#include( "stdlib.hhf" );


static
value1 : int16;
value2 : int16;
value3 : int16;

procedure smallest( var value1 : int16; var value2 : int16; var value3 : int16 ); @nodisplay; @noframe; 

static
dReturnAddress : dword;

begin smallest;

pop( dReturnAddress ); 
pop( AX ); 
pop( BX );
pop( CX ); 
push( dReturnAddress );

mov(AX, DX);  
cmp(DX, BX);    
jl check_1;        
mov(BX, DX);   

check_1:
cmp (DX, CX);   
jl end_smallest;   
jmp update_1;     


update_1:
mov (CX, DX);   
jmp end_smallest; 

end_smallest:
stdout.put( "The smallest value is " );
stdout.puti16( DX );

ret( ) ;
end smallest;



begin Smallest_Number;


stdout.put( "Provide value1:" );
stdin.get(value1);
stdout.put( "Provide value2:" );
stdin.get(value2);
stdout.put( "Provide value3:" );
stdin.get(value3);

mov(value1,AX);
mov(value2,BX);
mov(value3,CX);


push(CX);
push(BX);
push(AX);

call smallest;

end Smallest_Number;

I have no idea how to do it


Solution

  • My professor said "My Code Is Not Preserving The Registers It Touches." I do not know how to do it. Could anyone help? Is it about push and pop?

    Yes! Given this is an assignment, I won't make changes to your code, but I believe that's what your teacher means by "preserving registers it touches."

    Something worth reading up on is calling conventions. I believe ABIs have mostly standardized them for each platform nowadays, but there are still plenty of conventions around to choose from.

    Doing my best to help you find your way, consider the following C code:

    #include <stdio.h>
    
    int a = 0;
    
    void my_cool_function (int b) {
      a += b + 1;
    }
    
    int main () {
      int a;
      a = 0;
      printf("%d", a);
    
    
      int b;
      b = 1;
      my_cool_function(b);
    
      printf("%d", a);
    }
    

    As you may already know, the a variable inside main is completely different from the a global variable due to scoping rules. An interesting question would be: could we make it behave the same if these were registers? That is, if a in the program above wasn't an abstract container, but rather the name of a register.

    Were that the case, the program could behave differently, right? But we can definitely avoid that case somewhat similarly to what is done to variables with the same name. I'll use this uncool sum as a simpler case-study:

    int my_uncool_sum (int b, int c) {
      return b + c + 1;
    }
    

    I've compiled it on my ARM (AArch64) machine with zero optimizations so the compiler doesn't get in the way. Here is the disassembly dump that I got:

    [0x0] <+0>:   sub    sp, sp, #0x10
    [0x4] <+4>:   str    w0, [sp, #0xc]
    [0x8] <+8>:   str    w1, [sp, #0x8]
    [0xc] <+12>:  ldr    w8, [sp, #0xc]
    [0x10] <+16>: ldr    w9, [sp, #0x8]
    [0x14] <+20>: add    w8, w8, w9
    [0x18] <+24>: add    w0, w8, #0x1
    [0x1c] <+28>: add    sp, sp, #0x10
    [0x20] <+32>: ret
    

    In this architecture, ints are 4 bytes long and the destination register is the first operand of most instructions (stores such as str are the exception: their first operand is the value being stored). Knowing that, there's a symmetry between the first instruction, sub sp, sp, #0x10, and the penultimate instruction, add sp, sp, #0x10, right? The register being used there is the Stack Pointer, and 0x10 (16 bytes) is subtracted and later added back, which is more than enough room for the two 4-byte integers (footnote 1). Remember: the stack grows from high addresses to low addresses.

    Your code fetches its arguments with pop, e.g., pop( AX );. That pop instruction takes a value from the stack and puts it into the register AX. What was in AX before? Here, by contrast, the arguments arrive in the registers w0 and w1. Anyway, let's keep going.

    ARM is a load/store architecture: arithmetic works only on registers, and unoptimized code keeps every variable in memory, so a little dance is necessary. First, the argument is stored to the stack: str w0, [sp, #0xc]. The first parameter, which arrived in w0, now lives at the address sp + 0xc.

    Then, we load that value back from memory: ldr w8, [sp, #0xc] (the compiler chose not to reuse w0 for some reason). w8 now holds the value of our first parameter. The same process happens with the pair of registers w1 and w9 for the second argument. We sum them together, add 1 with the result landing in w0 (the return-value register), and increment the Stack Pointer before returning.

    One last thing is missing: what does it look like to call this function? The next snippet is the entire module with definition and a call to the function my_uncool_sum:

    int my_uncool_sum (int b, int c) {
      return b + c + 1;
    }
    
    int uncool_sum_usage() {
      volatile register int busy = 0xFFF;
      return busy + my_uncool_sum(1, 2);
    }
    

    Note that I'm being annoying in uncool_sum_usage and asking for 4095 (0xFFF in hex) to be kept around in a register. Here is the disassembly of uncool_sum_usage:

    [0x24] <+0>:  sub    sp, sp, #0x20
    [0x28] <+4>:  stp    x29, x30, [sp, #0x10]
    [0x2c] <+8>:  add    x29, sp, #0x10
    [0x30] <+12>: mov    w8, #0xfff
    [0x34] <+16>: stur   w8, [x29, #-0x4]
    [0x38] <+20>: ldur   w8, [x29, #-0x4]
    [0x3c] <+24>: str    w8, [sp, #0x8]
    [0x40] <+28>: mov    w0, #0x1
    [0x44] <+32>: mov    w1, #0x2
    [0x48] <+36>: bl     0x48                      ; <+36> at main.c:7:17
    [0x4c] <+40>: ldr    w8, [sp, #0x8]
    [0x50] <+44>: add    w0, w8, w0
    [0x54] <+48>: ldp    x29, x30, [sp, #0x10]
    [0x58] <+52>: add    sp, sp, #0x20
    [0x5c] <+56>: ret
    

    As the fourth instruction tells us, 0xfff was placed in w8. It is followed by a stur (store register, unscaled offset) instruction, which writes to the address x29 - 0x4. x29 is the frame pointer, and 0x4, once again, is just enough space for an int (footnote 2). Note that x29 was just set relative to the Stack Pointer by the third instruction, add x29, sp, #0x10 (footnote 3). After calling the function, we need 0xFFF back in order to return, and we get it using the Stack Pointer yet again: ldr w8, [sp, #0x8].

    0xFFF was saved and restored even though we made a function call that could've altered our registers. Calling conventions are crucial for making this behavior consistent. It might help to ask yourself: how is the stack significant for function calls? How was it used in this example, and how does that relate to calling conventions? How can we make sure we retain previous information when jumping to code that might do whatever it wants with registers?

    Last note: in case you wish to replicate this on another architecture (maybe using a different convention?), the code was compiled with -std=c99 -O0 -g, and lldb was used to get the disassembly. If the object file is named a.out (the default), you can get the disassembly for the function by running lldb a.out and then entering disassemble -n uncool_sum_usage.

    1 In this architecture, addresses are 64 bits wide. Note that the registers in use are prefixed with w, e.g., w0. This means they are being used in 32-bit mode, so the top 32 bits are ignored. The two ints stored on the stack need only 8 bytes, but AArch64 requires the Stack Pointer to stay 16-byte aligned, so the compiler reserves 16 bytes, or 0x10 in hex.

    2 Here, 0x4 is the size of the value being stored (a 4-byte int), not the size of an address. The stur instruction is storing the value of the busy variable, namely 0xFFF.

    3 Not completely related to the problem at hand, but I figured it would be useful to showcase what's happening here as well. Without giving away too much, the second instruction stores the pair x29 and x30 (the frame pointer and the return address). The following instruction, add x29, sp, #0x10, is similar to the ones we've seen before: x29 now holds the address SP + 0x10 (a higher address than SP). The sequence starting with mov w8, #0xfff and ending with ldur w8, [x29, #-0x4] is probably a consequence of disabling optimization. The end result is that w8 holds the value 0xFFF. It only matters for the next instruction, str w8, [sp, #0x8], which keeps that value safe until we restore it with ldr w8, [sp, #0x8].