This is a function in a C program (including inline assembly) that compiles and runs fine with gcc.
extern unsigned char __heap_base;
extern unsigned char __heap_limit;

caddr_t _sbrk_r (struct _reent *r, int incr)
{
    extern unsigned char __bottom_of_heap asm ("__heap_base");
    extern unsigned char __limit_of_heap asm ("__heap_limit");
    register unsigned char *__stack_ptr asm ("sp");
    ...
}
In a separate assembly file, __heap_base and __heap_limit are defined as below (excerpts).
.global __heap_base
.global __heap_limit
__user_thread_space:
.equ __heap_base, __user_thread_space + THREAD_AREA
.equ __heap_limit, __heap_base + HEAP_SIZE
The asm("name") suffix on those declarations is an asm label (https://gcc.gnu.org/onlinedocs/gcc/Asm-Labels.html): it sets the asm symbol names for those C vars. This lets you set the asm name regardless of whether the compiler normally prepends an extra _ or not, for example, or override C++ name mangling. In this case it is simply used to give the asm symbol a different name than the C var.
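As a minimal sketch of that renaming, the snippet below defines an object under one C name and reads it through a differently named block-scope declaration bound to the same asm symbol. The names storage, bottom_of_demo_heap, read_via_other_name and the symbol demo_heap_base are hypothetical, picked only for illustration. It uses the __asm__ spelling, which GCC also accepts under strict -std=c11, where plain asm is not a keyword.

```c
/* Definition: the C name is `storage`, but the object-file symbol is
   `demo_heap_base` (a hypothetical name chosen for this sketch). */
unsigned char storage __asm__("demo_heap_base") = 42;

/* A differently named C declaration bound to the same asm symbol,
   scoped to one function like the declarations in the question. */
unsigned char read_via_other_name(void)
{
    extern unsigned char bottom_of_demo_heap __asm__("demo_heap_base");
    return bottom_of_demo_heap;
}
```

Both C names resolve to the single symbol demo_heap_base at link time, which is the same relationship __bottom_of_heap has to __heap_base in the question.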
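The register ... asm("regname") syntax for binding a variable to the stack pointer is also documented: https://gcc.gnu.org/onlinedocs/gcc/Global-Register-Variables.html. (That page covers the file-scope form; the declaration in the question is at block scope, which GCC documents separately as a local register variable.)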
Assigning to that variable would certainly break things, and even reading it doesn't have fully well-defined / guaranteed semantics with respect to when the compiler modifies SP itself to make space for locals or push args for a function with lots of args, or with respect to alloca. For example, I wouldn't be surprised if CSE (common subexpression elimination) of __stack_ptr was possible even in a loop that used alloca, letting the compiler assume the C variable had the same value every time it was read, as long as you don't explicitly write it. But I'd also expect whatever value you get to be somewhere in the current function's stack frame, which is about as much as one can meaningfully say about the asm stack pointer in a C program. I certainly wouldn't try to write a context-switch function that did __stack_ptr = new_stack;
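If all you need is "some address in the current stack frame", a different technique avoids the register variable entirely: the GCC/clang builtin __builtin_frame_address(0). This is a swapped-in alternative, not what the code in the question does, and the function names below are hypothetical. A rough sketch:

```c
#include <stddef.h>

/* Returns the frame address of this function. __builtin_frame_address(0)
   is a GCC/clang builtin; noinline keeps the function from being merged
   into its caller's frame. */
__attribute__((noinline)) void *inner_frame_address(void)
{
    return __builtin_frame_address(0);
}

/* Returns nonzero if a callee's frame sits at a different address
   than this function's own frame. */
__attribute__((noinline)) int callee_has_own_frame(void)
{
    void *mine = __builtin_frame_address(0);
    void *theirs = inner_frame_address();
    return mine != NULL && theirs != NULL && mine != theirs;
}
```

Like the register variable, this only tells you roughly where the stack is; it is not a way to change the stack pointer.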
Note that the syntax for these two uses of the asm keyword is essentially identical: both are sort of telling the compiler "when you print a reference to this variable in the asm source you're generating, use this name". (Of course, it has to know whether it's a register or a symbol, especially on a load/store machine as opposed to a CISC.)
Why is extern variable declared inside a function?
I assume just to limit the scope of those declarations.
In normal ISO C, you can write extern char bar; inside a function, and the compiler will reference global variable bar. (At least gcc -Wall -Wextra -Wpedantic doesn't have any complaints, and you can see from the asm that without any asm() override, it's using plain bar as the asm symbol name: https://godbolt.org/z/j1KznMEPG)
I.e. it's referencing a variable in static storage with that name, with the C declaration scoped to this function instead of global scope.
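A small sketch of that scoping, with hypothetical names (shared_counter, read_shared_counter). The definition is placed after the function to show that no file-scope declaration is visible where the function is compiled:

```c
/* No file-scope declaration of shared_counter is visible at this point. */
int read_shared_counter(void)
{
    extern int shared_counter;  /* block-scope declaration, external linkage */
    return shared_counter;
}

/* The definition, at file scope. In a real project this would typically
   live in another translation unit (or, as in the question, an asm file). */
int shared_counter = 7;
```

Code between the function and the definition could not name shared_counter without its own declaration, which is the scope-limiting effect described above.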
(BTW, a static var inside a function normally gets a decorated asm name, involving the function's name or a numeric suffix depending on the compiler, so it doesn't conflict with a static var of the same name in another function. Of course, this is extern not static, so it makes sense it's using a plain global name.)
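To illustrate the static case, here is a sketch with two hypothetical functions that each own a static local named counter. The two objects are independent, which is why the compiler has to give them distinct asm names; the exact names are compiler-specific and best checked in the generated asm.

```c
/* Each function has its own `counter` object with static storage.
   The compiler emits two distinct local symbols for them. */
int next_small_id(void)
{
    static int counter = 0;
    return ++counter;
}

int next_large_id(void)
{
    static int counter = 100;
    return ++counter;
}
```

Calling next_small_id() never affects the sequence returned by next_large_id(), even though both variables share a C name.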