Call C function from Assembly, passing args and getting the return value in the ARM calling convention


I want to call a C function, say:

int foo(int a, int b) {return 2;}

from (ARM) assembly code. I read that I need to mention

import foo

in my assembly code, so that the assembler searches for foo in the C file. But I am stuck at passing the arguments a and b from assembly and retrieving the integer result (here 2) back in assembly. Could someone explain how to do this, with a mini example?


Solution

  • You have already written the minimal example.

    int foo(int a, int b) {return 2;}
    

    Compile and disassemble:

    arm-none-eabi-gcc -O2 -c so.c -o so.o
    arm-none-eabi-objdump -d so.o
    
    so.o:     file format elf32-littlearm
    
    
    Disassembly of section .text:
    
    00000000 <foo>:
       0:   e3a00002    mov r0, #2
       4:   e12fff1e    bx  lr
    

    Anything to do with a and b is dead code, so it is optimized out. Using C to learn asm is a good way to get started, but you really want to do it with optimizations on, which means you have to work harder at crafting the experimental code.

    int foo(int a, int b) {return 2;}
    int bar ( void )
    {
        return(foo(5,4));
    }
    

    and we learn nothing new, because the compiler can see the body of foo and replaces the call with the constant.

    Disassembly of section .text:
    
    00000000 <foo>:
       0:   e3a00002    mov r0, #2
       4:   e12fff1e    bx  lr
    
    00000008 <bar>:
       8:   e3a00002    mov r0, #2
       c:   e12fff1e    bx  lr
    

    To see an actual call, give the compiler only a declaration of foo, so that it cannot look inside the function:

    int foo(int a, int b);
    int bar ( void )
    {
        return(foo(5,4));
    }
    

    and now we see

    00000000 <bar>:
       0:   e92d4010    push    {r4, lr}
       4:   e3a01004    mov r1, #4
       8:   e3a00005    mov r0, #5
       c:   ebfffffe    bl  0 <foo>
      10:   e8bd4010    pop {r4, lr}
      14:   e12fff1e    bx  lr
    

    (Yes, this is built for this compiler's default target, armv4t. That is obvious to some; others will have no clue how I/we know. You can also tell roughly how new or old the compiler is from this example, because there was an ABI change years ago that is visible here. In my opinion the newer versions of gcc are worse than the older ones, so an older one is still good to use for some use cases.)

    What follows is per this compiler's convention. This compiler does follow some version of the ARM convention document, but always remember that this is the compiler authors' choice; they are under no obligation to conform to anyone's written standard.

    So we see that the first parameter goes in r0 and the second in r1. You can craft functions with more operands, or more types of operands, to see what nuances there are: how many go in registers and when the compiler starts using the stack instead. For example, try a 64 bit variable followed by a 32 bit one as operands, then try them in the reverse order.

    Now to see what is going on on the callee side:

    int foo(int a, int b)
    {
        return((a<<1)+b+0x123);
    }
    

    We see that r0 and r1 hold the first two operands; the compiler would be grossly broken otherwise.

    00000000 <foo>:
       0:   e0810080    add r0, r1, r0, lsl #1
       4:   e2800e12    add r0, r0, #288    ; 0x120
       8:   e2800003    add r0, r0, #3
       c:   e12fff1e    bx  lr
    

    What we did not see explicitly in the caller example is that r0 is also where the return value is stored (at least for this variable type).

    The ABI documentation is not an easy read, but if you first "just try it" and then refer to the documentation, the experiments should help you make sense of it. At the end of the day you have a compiler you are going to use; it has a convention and is probably part of a toolchain, so you must conform to that compiler's convention, not some third party document (even if that third party is arm). You should probably also use that toolchain's assembler, which means using that assembler's assembly language (there are many incompatible assembly languages for arm; the tool defines the language, not the target).

    You can see how simple it is to figure this out on your own.

    This gets a little painful, but you can also look at the assembly output of the compiler, at least for those compilers that let you. With gcc you can use -save-temps or -S:

    int foo(int a, int b)
    {
        return 2;
    }
    
        .cpu arm7tdmi
        .eabi_attribute 20, 1
        .eabi_attribute 21, 1
        .eabi_attribute 23, 3
        .eabi_attribute 24, 1
        .eabi_attribute 25, 1
        .eabi_attribute 26, 1
        .eabi_attribute 30, 2
        .eabi_attribute 34, 0
        .eabi_attribute 18, 4
        .file   "so.c"
        .text
        .align  2
        .global foo
        .arch armv4t
        .syntax unified
        .arm
        .fpu softvfp
        .type   foo, %function
    foo:
        @ Function supports interworking.
        @ args = 0, pretend = 0, frame = 0
        @ frame_needed = 0, uses_anonymous_args = 0
        @ link register save eliminated.
        mov r0, #2
        bx  lr
        .size   foo, .-foo
        .ident  "GCC: (15:9-2019-q4-0ubuntu1) 9.2.1 20191025 (release) [ARM/arm-9-branch revision 277599]"
    

    You "need" almost none of this.

    The minimum looks like this

    .globl foo
    foo:
        mov r0,#2
        bx lr
    

    .global and .globl are equivalent; which one you use somewhat reflects your age, or how and when you learned gnu assembler.

    Now this will break if you are mixing arm and thumb instructions, because the assembler defaults to arm.

    arm-none-eabi-as x.s -o x.o
    arm-none-eabi-objdump -d x.o
    
    x.o:     file format elf32-littlearm
    
    
    Disassembly of section .text:
    
    00000000 <foo>:
       0:   e3a00002    mov r0, #2
       4:   e12fff1e    bx  lr

    If we want thumb then we have to tell the assembler:

    .thumb
    .globl foo
    foo:
        mov r0,#2
        bx lr
    

    and we get thumb.

    00000000 <foo>:
       0:   2002        movs    r0, #2
       2:   4770        bx  lr
    

    With ARM, and with the gnu toolchain at least, you can mix arm and thumb and the linker will take care of the transition. Take this arm caller:

    int foo ( int, int );
    int fun ( void )
    {
        return(foo(1,2));
    }
    

    We do not need a bootstrap or anything else to get the linker to link, so we can see how that part of it works.

    arm-none-eabi-ld so.o x.o -o so.elf
    arm-none-eabi-ld: warning: cannot find entry symbol _start; defaulting to 0000000000008000
    arm-none-eabi-objdump -d so.elf
    
    so.elf:     file format elf32-littlearm
    
    
    Disassembly of section .text:
    
    00008000 <fun>:
        8000:   e92d4010    push    {r4, lr}
        8004:   e3a01002    mov r1, #2
        8008:   e3a00001    mov r0, #1
        800c:   eb000001    bl  8018 <foo>
        8010:   e8bd4010    pop {r4, lr}
        8014:   e12fff1e    bx  lr
    
    00008018 <foo>:
        8018:   2002        movs    r0, #2
        801a:   4770        bx  lr
    

    Now this is broken, and not just because we have no bootstrap, etc.: there is a plain bl to foo, but foo is thumb and the caller is arm. For gnu assembler for arm you can take this shortcut, which I think I learned from an older gcc, but whatever:

    .thumb
    
    .thumb_func
    .globl foo
    foo:
        mov r0,#2
        bx lr
    

    .thumb_func says that the next label the assembler finds is to be considered a thumb function label, not just an address.

    00008000 <fun>:
        8000:   e92d4010    push    {r4, lr}
        8004:   e3a01002    mov r1, #2
        8008:   e3a00001    mov r0, #1
        800c:   eb000003    bl  8020 <__foo_from_arm>
        8010:   e8bd4010    pop {r4, lr}
        8014:   e12fff1e    bx  lr
    
    00008018 <foo>:
        8018:   2002        movs    r0, #2
        801a:   4770        bx  lr
        801c:   0000        movs    r0, r0
        ...
    
    00008020 <__foo_from_arm>:
        8020:   e59fc000    ldr ip, [pc]    ; 8028 <__foo_from_arm+0x8>
        8024:   e12fff1c    bx  ip
        8028:   00008019    .word   0x00008019
        802c:   00000000    .word   0x00000000
    

    The linker adds a trampoline, as I call it; I think others call it a veneer. Either way the toolchain took care of it, so long as we write the code right.

    Remember that this syntax in particular is very much assembler specific; other assemblers may have other syntax to make this work. From the gcc generated code we see the generic solution, which is more typing but probably a better habit.

    .thumb
    
    .type foo, %function
    .global foo
    foo:
        mov r0,#2
        bx lr
    

    The .type foo, %function directive works for both arm and thumb in gnu assembler for arm. It does not have to be positioned just before the label (neither does .globl or .global). We get the same result from the toolchain with this assembly language.

    Just for demonstration...

    arm-none-eabi-as x.s -o x.o
    arm-none-eabi-gcc -O2 -mthumb -c so.c -o so.o
    arm-none-eabi-ld so.o x.o -o so.elf
    arm-none-eabi-ld: warning: cannot find entry symbol _start; defaulting to 0000000000008000
    arm-none-eabi-objdump -d so.elf
    
    so.elf:     file format elf32-littlearm
    
    
    Disassembly of section .text:
    
    00008000 <fun>:
        8000:   b510        push    {r4, lr}
        8002:   2102        movs    r1, #2
        8004:   2001        movs    r0, #1
        8006:   f000 f807   bl  8018 <__foo_from_thumb>
        800a:   bc10        pop {r4}
        800c:   bc02        pop {r1}
        800e:   4708        bx  r1
    
    00008010 <foo>:
        8010:   e3a00002    mov r0, #2
        8014:   e12fff1e    bx  lr
    
    00008018 <__foo_from_thumb>:
        8018:   4778        bx  pc
        801a:   e7fd        b.n 8018 <__foo_from_thumb>
        801c:   eafffffb    b   8010 <foo>
    

    And you can see it works both ways, thumb to arm and arm to thumb; if we write the asm right, the toolchain does the rest of the work for us.

    Now I personally hate the unified syntax; it is one of the major mistakes arm has made, along with CMSIS. But if you want to do this for a living, you find that you pretty much hate most corporate decisions and, worse, have to work with them. Often the unified syntax generates the wrong instruction, and when I need a specific instruction I have to fiddle with the syntax until the assembler generates it. Other than a bootstrap and some other exceptions you do not often write assembly language anyway; usually you compile something, then take the compiler generated code and tune it or replace it.

    I started with the arm gnu tools before unified syntax so I am used to

    .thumb
    .globl hello
    hello:
      sub r0,#1
      bne hello
    

    instead of

    .thumb
    .globl hello
    hello:
      subs r0,#1
      bne hello
    

    And I am fine with bouncing between the two syntaxes (unified and not; yes, two assembly languages within one tool).

    All of the above is for 32 bit arm. If you are interested in 64 bit arm, and are using gnu tools, then a percentage of this still applies; you just need to use the aarch64 tools, not the arm tools, from gnu. ARM's aarch64 is a completely different, and incompatible, instruction set from aarch32. But gnu syntax like .global and .type...function is often shared across all gnu supported targets. There are exceptions for some directives, but if you take the same approach of having the tools themselves tell you how they work, by using them, you can figure this out.

    so.elf:     file format elf64-littleaarch64
    
    
    Disassembly of section .text:
    
    0000000000400000 <fun>:
      400000:   52800041    mov w1, #0x2                    // #2
      400004:   52800020    mov w0, #0x1                    // #1
      400008:   14000001    b   40000c <foo>
    
    000000000040000c <foo>:
      40000c:   52800040    mov w0, #0x2                    // #2
      400010:   d65f03c0    ret