assembly arm calling-convention

Procedure Call Standard for the ARM® Architecture: 2 separate, but related return values


The function read32 should return:

  • status code (error or success, e.g., 0 = success, > 0 error codes)
  • data read from a data source (only valid if the status code represents success)

Does the following example conform to the AAPCS? Is there a better way to do this under the AAPCS?

// OUT:
//  R0 (status code): 0 if valid, else > 0
//  R1 (data):
//  if R0 indicates a successful read: the data that was read,
//  else: undefined
read32:
    PUSH    {LR}

    LDR     R0, =memoryMappedAddressOfDataSource
    LDR     R4, [R0]
    BL      checkValidRead
    // If the read was invalid, directly return the error 
    // code set by checkValidRead in R0 and do not change R1.
    CBNZ    R0, read32_return

    // R0 is 0, so the read was valid and the data is returned in R1.
    MOV     R1, R4

read32_return:  
    POP     {PC}

// IN: none
// Checks a special status register to determine
// whether the last read was successful.
// OUT:
//  R0: 0 if valid, else > 0
checkValidRead:
    ...

From the AAPCS (page 18):

A double-word sized Fundamental Data Type (e.g., long long, double and 64-bit containerized vectors) is returned in r0 and r1.

A Composite Type larger than 4 bytes, or whose size cannot be determined statically by both caller and callee, is stored in memory at an address passed as an extra argument when the function was called (§5.5, rule A.4). The memory to be used for the result may be modified at any point during the function call.

However, I do not know whether my pair of return values counts as a containerized 64-bit vector, an aggregate composite type, or something else:

The content of a containerized vector is opaque to most of the procedure call standard: the only defined aspect of its layout is the mapping between the memory format (the way a fundamental type is stored in memory) and different classes of register at a procedure call interface.

A Composite Type is a collection of one or more Fundamental Data Types that are handled as a single entity at the procedure call level. A Composite Type can be any of: An aggregate, where the members are laid out sequentially in memory [...]


Solution

  • The doc you quoted says that any composite type that doesn't fit in a single register is returned via a hidden pointer. This includes a C struct.

    Only a single wide integer or FP type can be returned in a register pair.

    A register pair is more efficient than a store/reload through a hidden pointer, so it's unfortunate that you have to hack around the calling convention instead of just returning a struct { uint32_t flag, value; }.

    To describe the calling convention you want to a C compiler, tell it the function returns a uint64_t, then split that value into two 32-bit integer variables. The split is free because the compiler already has the two halves in separate registers.

    For example (source + asm on the Godbolt compiler explorer). I used a union, but you could equally use a shift.

    #include <stdint.h>
    
    uint64_t read32(void);
    
    union unpack32 {
        uint64_t u64;
        uint32_t u32[2];
    };
    
    void ext(uint32_t);        // something to do with a return value
    
    unsigned read32_wrapper(void) {
        union unpack32 ret = { read32() };
        if (ret.u32[0]) {
            ext(ret.u32[1]);
        }
        return ret.u32[0];
    }
    

    compiles like this:

        push    {r4, lr}
        bl      read32
        subs    r4, r0, #0          @ set flags and copy the flag to r4
    
        movne   r0, r1
        blne    ext                 @ the if() body.
    
        mov     r0, r4              @ always return the status flag
        pop     {r4, lr}
        bx      lr
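
The shift-based alternative to the union mentioned in the answer could look like the sketch below. The helper names pack_result, result_status and result_data are hypothetical and do not appear in the original answer. The layout assumes a little-endian AAPCS target, where r0 holds the low word of a 64-bit return value and r1 the high word.

```c
#include <stdint.h>

/* Hypothetical helpers for illustration only.
   Assumption: little-endian AAPCS, so the low 32 bits of a returned
   uint64_t travel in r0 and the high 32 bits in r1. */

/* Callee side: put the status code in the low word and the data in the high word. */
static inline uint64_t pack_result(uint32_t status, uint32_t data) {
    return ((uint64_t)data << 32) | status;
}

/* Caller side: extract the status code (low word). */
static inline uint32_t result_status(uint64_t packed) {
    return (uint32_t)packed;
}

/* Caller side: extract the data (high word); only meaningful if the status is 0. */
static inline uint32_t result_data(uint64_t packed) {
    return (uint32_t)(packed >> 32);
}
```

A caller would check result_status() first and read result_data() only when the status is 0. On a big-endian target the two halves would likely end up in the opposite registers, so treat the register assignment as an assumption to verify against the compiler output.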