c, gcc, inline-assembly

Inline assembly statements in the variable declarations and extern variable declarations inside a function


This function is from a C program (with inline assembly) that compiles and runs fine with gcc.

  1. What are the asm statements at the end of the variable declarations?
  2. Why are the extern variables declared inside a function? What effect does that have compared to declaring them outside the function?
extern unsigned char __heap_base;
extern unsigned char __heap_limit;

caddr_t _sbrk_r (struct _reent *r,int incr)
{
extern   unsigned char  __bottom_of_heap asm ("__heap_base");
extern   unsigned char  __limit_of_heap  asm ("__heap_limit");
register unsigned char *__stack_ptr  asm ("sp");
...
}

In a separate assembly file, __heap_base and __heap_limit are defined as follows (excerpt).

                .global  __heap_base
                .global  __heap_limit

__user_thread_space:
                .equ     __heap_base,  __user_thread_space + THREAD_AREA
                .equ     __heap_limit, __heap_base         + HEAP_SIZE

Solution

  • The asm("name") after each declarator is an asm label (https://gcc.gnu.org/onlinedocs/gcc/Asm-Labels.html): it sets the asm symbol name for that C variable. This lets you pick the asm name regardless of whether the compiler normally prepends an extra _, for example, or override C++ name mangling. Or, as in this case, simply use an asm symbol name that differs from the C variable name.
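To make the renaming concrete, here is a minimal sketch, assuming GCC or clang. The names counter_storage, alias_for_counter and shared_counter are made up for illustration and do not come from the question. The keyword is spelled __asm__ so the file also compiles under strict -std=c11, where plain asm is not a keyword.

```c
#include <assert.h>

/* Definition: the C name is counter_storage, but the object file
   exports it under the asm symbol name shared_counter. */
int counter_storage __asm__("shared_counter") = 42;

/* Reads the same object through a different C name. */
int read_shared_counter(void)
{
    /* Block-scope declaration bound to the same asm symbol. */
    extern int alias_for_counter __asm__("shared_counter");
    return alias_for_counter;
}

/* Usage: call this from main(). */
void demo_asm_label(void)
{
    assert(read_shared_counter() == 42);
}
```

The sketch only reads through the second name. Mixing reads and writes through two C names for one symbol inside a single translation unit is best avoided, because the optimizer may treat the two declarations as distinct objects.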


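    The register ... asm("regname") syntax that binds a variable to the stack pointer is also documented, under global register variables: https://gcc.gnu.org/onlinedocs/gcc/Global-Register-Variables.html. (Declared inside a function, as it is here, it is strictly a local register variable: https://gcc.gnu.org/onlinedocs/gcc/Local-Register-Variables.html, which GCC only documents as supported for choosing the registers of Extended asm operands.)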

    Assigning to that variable would certainly break things, and even reading it doesn't have fully well-defined or guaranteed semantics with respect to when the compiler modifies SP itself: to make space for locals, to push args for a function with lots of args, or for alloca. For example, I wouldn't be surprised if CSE of __stack_ptr were possible even in a loop that used alloca, since the compiler may assume a C variable has the same value every time it is read if you don't explicitly write it. But I'd also expect whatever value you get to be somewhere in the current function's stack frame, which is about as much as one can meaningfully say about the asm stack pointer in a C program. I certainly wouldn't try to write a context-switch function that did __stack_ptr = new_stack;
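As a hedged illustration of "somewhere in the current function's stack frame", the sketch below samples the stack pointer through a register variable and only checks that it lies near a local variable. It assumes GCC or clang on x86, ARM or AArch64. The names SP_REG, stack_ptr and distance_from_sp_to_local are hypothetical, and the variable is declared at file scope (the global form) rather than inside a function as in the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register name of the stack pointer on a few common targets. */
#if defined(__x86_64__)
#  define SP_REG "rsp"
#elif defined(__i386__)
#  define SP_REG "esp"
#elif defined(__aarch64__) || defined(__arm__)
#  define SP_REG "sp"
#else
#  error "add the stack pointer register name for this target"
#endif

/* File-scope register variable bound to the stack pointer.
   Treat it as read-only: never assign to it. */
register unsigned char *stack_ptr __asm__(SP_REG);

/* Distance in bytes between the sampled SP and a local of this function. */
size_t distance_from_sp_to_local(void)
{
    volatile unsigned char local = 0;
    uintptr_t sp = (uintptr_t)stack_ptr;
    uintptr_t addr = (uintptr_t)&local;
    return sp > addr ? (size_t)(sp - addr) : (size_t)(addr - sp);
}

/* Usage: call this from main(). */
void demo_stack_pointer(void)
{
    /* Only a loose claim is safe: SP is near this frame,
       not at any exact offset from the local. */
    assert(distance_from_sp_to_local() < 64 * 1024);
}
```

The check is deliberately loose (within 64 KiB, in either direction) because the compiler is free to place locals above or below the sampled value and to adjust SP between statements.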


    Note that the syntax for these two uses of the asm keyword is essentially identical. Both tell the compiler: when you print a reference to this variable in the asm output you're generating, use this name. (Of course it has to know whether the name is a register or a symbol, especially on a load/store machine as opposed to a CISC.)


    Why is extern variable declared inside a function?

    I assume just to limit the scope of those declarations.

    In normal ISO C, you can write extern char bar; inside a function, and the compiler will reference the global variable bar. (At least gcc -Wall -Wextra -Wpedantic doesn't complain, and you can see from the asm that without any asm() override it uses plain bar as the asm symbol name: https://godbolt.org/z/j1KznMEPG)

    I.e. it's referencing a variable in static storage with that name, with the C declaration scoped to this function instead of global scope.
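A minimal sketch of that scoping, in plain ISO C with no asm() override. The function name read_bar and the value 7 are made up for illustration; bar is the name used in the paragraph above.

```c
#include <assert.h>

int read_bar(void)
{
    /* Declaration only: it names an object with external linkage that
       is defined elsewhere. The identifier bar stays in scope until
       the closing brace of this function. */
    extern char bar;
    return bar;
}

/* At this point bar is no longer in scope: a function placed here
   would need its own declaration before it could use bar. */

char bar = 7;  /* the actual definition, at file scope */

/* Usage: call this from main(). */
void demo_block_scope_extern(void)
{
    assert(read_bar() == 7);
}
```

The definition is placed after the function on purpose, so the only declaration visible inside read_bar is the block-scope one.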

    (BTW, a static var inside a function normally gets a compiler-generated asm name, so it doesn't conflict with a static var of the same name in another function: GCC appends a numeric suffix to the C name, while clang prefixes the function's name. Of course, this is extern, not static, so it makes sense that it uses a plain global name.)
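For contrast, a small sketch of the static case. The functions next_a and next_b are hypothetical; each has a static local called n, and the two are separate objects even though they share a C name.

```c
#include <assert.h>

int next_a(void)
{
    static int n = 0;     /* private to next_a */
    return ++n;
}

int next_b(void)
{
    static int n = 100;   /* a different object, private to next_b */
    return ++n;
}

/* Usage: call this once from main(), before any other calls. */
void demo_static_locals(void)
{
    assert(next_a() == 1);
    assert(next_a() == 2);
    assert(next_b() == 101);  /* unaffected by the calls to next_a */
}
```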