I found the following code in u-boot/arch/arm/lib/semihosting.c. It uses bkpt and other instructions, and supplies input and output operands even though the asm template does not reference them:
static noinline long smh_trap(unsigned int sysnum, void *addr)
{
    register long result asm("r0");
#if defined(CONFIG_ARM64)
    asm volatile ("hlt #0xf000" : "=r" (result) : "0"(sysnum), "r"(addr));
#elif defined(CONFIG_CPU_V7M)
    asm volatile ("bkpt #0xAB" : "=r" (result) : "0"(sysnum), "r"(addr));
#else
    /* Note - untested placeholder */
    asm volatile ("svc #0x123456" : "=r" (result) : "0"(sysnum), "r"(addr));
#endif
    return result;
}
Minimal, verifiable example:
#include <stdio.h>
#include <stdlib.h>
int main(void)
{
    register long result asm("r0");
    void *addr = 0;
    unsigned int sysnum = 0;
    __asm__ volatile ("bkpt #0xAB" : "=r" (result) : "0"(sysnum), "r"(addr));
    return EXIT_SUCCESS;
}
According to the ARM Architecture Reference Manual, the bkpt instruction takes a single immediate operand, and according to my reading of the GCC manual's section on inline assembly, GCC does not allow operands to be supplied if they are not referenced in the template. Output assembly generated with -S:
.arch armv6
.eabi_attribute 28, 1
.eabi_attribute 20, 1
.eabi_attribute 21, 1
.eabi_attribute 23, 3
.eabi_attribute 24, 1
.eabi_attribute 25, 1
.eabi_attribute 26, 2
.eabi_attribute 30, 6
.eabi_attribute 34, 1
.eabi_attribute 18, 4
.file "bkpt-so.c"
.text
.align 2
.global main
.arch armv6
.syntax unified
.arm
.fpu vfp
.type main, %function
main:
@ args = 0, pretend = 0, frame = 8
@ frame_needed = 1, uses_anonymous_args = 0
@ link register save eliminated.
str fp, [sp, #-4]!
add fp, sp, #0
sub sp, sp, #12
mov r3, #0
str r3, [fp, #-8]
mov r3, #0
str r3, [fp, #-12]
ldr r2, [fp, #-12]
ldr r3, [fp, #-8]
mov r0, r2
.syntax divided
@ 10 "bkpt-so.c" 1
bkpt #0xAB
@ 0 "" 2
.arm
.syntax unified
mov r3, #0
mov r0, r3
add sp, fp, #0
@ sp needed
ldr fp, [sp], #4
bx lr
.size main, .-main
.ident "GCC: (Raspbian 8.3.0-6+rpi1) 8.3.0"
.section .note.GNU-stack,"",%progbits
So what is the point of "=r" (result) : "0"(sysnum), "r"(addr) in this line?
__asm__ volatile ("bkpt #0xAB" : "=r" (result) : "0"(sysnum), "r"(addr));
The fact that this code exists in a well-known project like U-Boot does not by itself instill confidence in it. The code relies on the ARM ABI (procedure call standard), which passes the first four scalar arguments in r0 (argument 1), r1 (argument 2), r2 (argument 3), and r3 (argument 4).
Table 6.1 summarizes the ABI:
The U-Boot code assumes that addr, which was passed to the function in r1, is still in r1 with the same value when the inline assembly is emitted. I consider this dangerous because, even in a simple non-inlined function, GCC does not guarantee this behaviour. The code has probably never caused a problem in practice, but it is fragile and in theory it could. Relying on incidental code generation behaviour of the compiler is not a good idea.
I believe it would have been better written as:
static noinline long smh_trap(unsigned int sysnum, void *addr)
{
    register long result asm("r0");
    register void *reg_r1 asm("r1") = addr;
#if defined(CONFIG_ARM64)
    asm volatile ("hlt #0xf000" : "=r" (result) : "0"(sysnum), "r"(reg_r1) : "memory");
#elif defined(CONFIG_CPU_V7M)
    asm volatile ("bkpt #0xAB" : "=r" (result) : "0"(sysnum), "r"(reg_r1) : "memory");
#else
    /* Note - untested placeholder */
    asm volatile ("svc #0x123456" : "=r" (result) : "0"(sysnum), "r"(reg_r1) : "memory");
#endif
    return result;
}
This code passes addr through a variable (reg_r1) that is pinned to register r1 for the purposes of the inline assembly constraint. At higher optimization levels the compiler should not generate any extra code for the extra variable. I have also added a memory clobber, because it is not a good idea to pass a memory address through a register in this way without one. Without the clobber, problems could arise if someone were to make an inlined version of this function. The memory clobber ensures that any pending data is written to memory before the inline assembly runs and, if necessary, reloaded afterwards.
As for what "=r" (result) : "0"(sysnum), "r"(addr) does:
"=r"(result) is an output constraint. It tells the compiler that the value in the output register after the inline assembly completes will be placed in the variable result. Because result is declared with asm("r0"), that register is r0.
"0"(sysnum) is an input constraint. It tells the compiler that sysnum will be passed into the inline assembly in the same register as operand 0 (operand 0 uses register r0).
"r"(addr) passes addr in a general-purpose register, and the U-Boot code assumes that this register will be r1. In my version it is explicitly defined that way.
None of these operands has to be referenced in the template. The constraints alone tell the compiler which values must be in registers before the instruction runs and where the result is found afterwards.
Information on operands and constraints for extended inline assembly can be found in the GCC documentation, which also lists additional machine-specific constraints.
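The effect of the matching constraint can be observed on any target by using an empty template. The sketch below is not U-Boot code: tie_demo and tie_demo_check are hypothetical names, and both tied operands are declared long (an assumption made here so that input and output have the same size). Since the asm emits no instruction, whatever the compiler loaded for "0"(sysnum) is read straight back as result.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not U-Boot code: the operand list mirrors the
 * bkpt line, but the template is empty, so it builds on any GCC or
 * Clang target and emits no instruction.
 */
long tie_demo(long sysnum, void *addr)
{
    long result;

    __asm__ volatile (""        /* template references no operand */
        : "=r" (result)         /* operand 0: output in a register */
        : "0" (sysnum),         /* input, same register as operand 0 */
          "r" (addr));          /* input, some general register */

    /* Nothing ran inside the asm, so operand 0 still holds sysnum. */
    return result;
}

/* Usage check: the tied input comes straight back out as the output. */
void tie_demo_check(void)
{
    int dummy = 0;

    assert(tie_demo(24, &dummy) == 24);
    assert(tie_demo(0, NULL) == 0);
}
```

With a real trap instruction in the template, the debugger would overwrite that register before the compiler reads it back, which is how the return value reaches result.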
hlt, bkpt, and svc are all being used here as system calls that ask the debugger to perform a service (semihosting). More documentation on semihosting is available from ARM. The different ARM architectures use slightly different trap mechanisms. The convention for a semihosting system call is that r0 contains the system call number, r1 contains the first argument of the system call, and the system call places a return value in r0 before returning to user code.
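As a rough illustration of that convention, here is a sketch of a SYS_WRITE call (operation 0x05, whose argument is a pointer to a three-word block: handle, data pointer, byte count). Everything named here is an assumption for illustration: smh_trap_sketch, smh_write_sketch, fake_debugger, fake_console and the SMH_REAL_TRAP macro are hypothetical. By default the trap is replaced by a host-side stand-in so the code can run without a debugger, and the real bkpt path is untested.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SYS_WRITE 0x05  /* semihosting operation: write bytes to a handle */

#if defined(SMH_REAL_TRAP) && defined(__ARM_ARCH_PROFILE) && (__ARM_ARCH_PROFILE == 'M')
/* Untested: real trap for M-profile cores, pinned to r0/r1 as in the answer. */
long smh_trap_sketch(long sysnum, void *addr)
{
    register long result __asm__("r0");
    register void *reg_r1 __asm__("r1") = addr;

    __asm__ volatile ("bkpt #0xAB"
        : "=r" (result) : "0" (sysnum), "r" (reg_r1) : "memory");
    return result;
}
#else
/* Host-side stand-in for the debugger (hypothetical, for illustration). */
char fake_console[64];
size_t fake_console_len;

static long fake_debugger(long op, void *arg)
{
    const uintptr_t *block = arg;  /* handle, data pointer, byte count */
    size_t len, room, n;

    if (op != SYS_WRITE || block == NULL)
        return -1;
    len = (size_t)block[2];
    room = sizeof(fake_console) - fake_console_len;
    n = len < room ? len : room;
    memcpy(fake_console + fake_console_len, (const void *)block[1], n);
    fake_console_len += n;
    return (long)(len - n);        /* bytes NOT written, 0 on success */
}

long smh_trap_sketch(long sysnum, void *addr)
{
    return fake_debugger(sysnum, addr);
}
#endif

/* Builds the parameter block in memory and passes its address as the argument. */
long smh_write_sketch(long handle, const void *buf, size_t len)
{
    uintptr_t block[3];

    block[0] = (uintptr_t)handle;
    block[1] = (uintptr_t)buf;
    block[2] = (uintptr_t)len;
    return smh_trap_sketch(SYS_WRITE, block);
}
```

The parameter block lives in memory and only its address travels in a register, which is the situation the memory clobber is meant to cover: the stores to block must be complete before the trap executes.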