I am learning ARM assembly language, using QEMU's vexpress-a9 machine as a virtual ARM CPU and GNU as as the assembler. This is my code:
... @ some vector table code
.section .text
Reset_Handler: @ 0x60010120
@ldr sp, = SRAM_BASE
ldr r10, =0x1111111 @ I know it is 0x01111111
ldr r12, =0x2222222
ldr r5, =0x3333333
.ltorg
ldr r11, =0x4444444
ldr r11, =0x5555555
stop:
b stop
After assembling, linking, running objcopy and loading the resulting .bin file in QEMU, the code starts at RAM address 0x60010120.
@ Output of the gdb command x/20x 0x60010120
0x60010120: 0xe59fa004 0xe59fc004 0xe59f5004 0x01111111
0x60010130: 0x02222222 0x03333333 0xe59fb004 0xe59fb004
0x60010140: 0xeafffffe 0x04444444 0x05555555 0x00000000
The words at addresses 0x6001012C to 0x60010134 are the constants I set in the code. They are data, not instructions, so I expected the program to crash at 0x6001012C.
However, the program reached the stop: b stop instruction. I single-stepped from Reset_Handler, and the gdb output confused me:
(gdb) ni
_Reset () at startup.s:8
8 b Reset_Handler
(gdb) ni
SRAM_BASE () at startup.s:22
22 ldr r10, =0x1111111
(gdb) i r pc
pc 0x60010120 0x60010120 <SRAM_BASE>
(gdb) ni
23 ldr r12, =0x2222222
(gdb) i r pc
pc 0x60010124 0x60010124 <SRAM_BASE+4>
(gdb) ni
24 ldr r5, =0x3333333
(gdb) i r pc
pc 0x60010128 0x60010128 <SRAM_BASE+8>
(gdb) ni
22 ldr r10, =0x1111111
(gdb) i r pc
pc 0x6001012c 0x6001012c <SRAM_BASE+12>
(gdb) ni
23 ldr r12, =0x2222222
(gdb) i r pc
pc 0x60010130 0x60010130 <SRAM_BASE+16>
(gdb) ni
24 ldr r5, =0x3333333
(gdb) i r pc
pc 0x60010134 0x60010134 <SRAM_BASE+20>
(gdb) ni
26 ldr r11, =0x4444444
(gdb) i r pc
pc 0x60010138 0x60010138 <SRAM_BASE+24>
(gdb) ni
27 ldr r11, =0x5555555
(gdb) i r pc
pc 0x6001013c 0x6001013c <SRAM_BASE+28>
(gdb) ni
stop () at startup.s:29
29 b stop
As we can see, the ldr instructions before .ltorg appear to execute twice. Why does gdb show ldr r10, =0x1111111 (line 22) as the instruction being executed when the word in RAM at that address is 0x01111111? I expected the program to crash there.
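In short, you got lucky: 0x01111111, 0x02222222 and 0x03333333 just happen to be valid instructions. The CPU cannot tell literal pool data from code, so it decodes and executes these words like any others.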
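To make this concrete, here is a minimal C sketch that pulls apart the fields of the three pool words. It assumes the A32 (ARM state) encoding, where bits 31..28 hold the condition and a data-processing immediate is an 8-bit value rotated right by twice a 4-bit count. The helper names `cond_field`, `eq_passes` and `arm_expand_imm` are made up for this illustration.

```c
#include <stdint.h>

/* Condition field of an A32 instruction word: bits 31..28.
   0x0 is EQ (execute only if Z is set), 0xE is AL (always execute). */
static unsigned cond_field(uint32_t word)
{
    return (unsigned)(word >> 28);
}

/* An EQ-conditional instruction takes effect only when the Z flag is set;
   otherwise it behaves like a no-op. */
static int eq_passes(int z_flag)
{
    return z_flag != 0;
}

/* Expand a 12-bit "modified immediate": imm8 rotated right by 2 * rot4. */
static uint32_t arm_expand_imm(uint32_t imm12)
{
    uint32_t rot  = ((imm12 >> 8) & 0xFu) * 2u;
    uint32_t imm8 = imm12 & 0xFFu;

    /* Guard rot == 0 so we never shift a 32-bit value by 32. */
    return rot ? ((imm8 >> rot) | (imm8 << (32u - rot))) : imm8;
}
```

All three pool words start with the nibble 0, so `cond_field` returns 0x0 (EQ) for each of them, while the real `ldr` words starting with `e` give 0xE (AL). `arm_expand_imm(0x222)` yields 0x20000002 and `arm_expand_imm(0x333)` yields 0xcc000000, matching the operands objdump prints for `eoreq` and `teqeq` further down. When Z is clear the three words do nothing at all. In the register dumps below r2 is still 0x3333333, whereas a taken `eoreq r2, r2, #0x20000002` would have left 0x23333331, which suggests Z was clear when the pool was reached.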
Now let's elaborate. I ran the following code on an ARMv7 CPU (Cortex-A9, Zynq-7000 SoC).
void test_so() __attribute__((naked));
void test_so()
{
asm volatile
(
"ldr r0, =0x1111111 \n\t"
"ldr r1, =0x2222222 \n\t"
"ldr r2, =0x3333333 \n\t"
".ltorg \n\t"
"add r3, r0, r1 \n\t"
// crash it
"mov r3, #0 \n\t"
"ldr r3, [r3] \n\t"
:::"memory", "r0", "r1", "r2", "r3"
);
}
...
printf("test start.\n");
test_so();
printf("test end.\n");
Disassembly of test_so with GNU objdump:
01a281b4 <test_so()>:
1a281b4: e59f0004 ldr r0, [pc, #4] ; 1a281c0 <test_so()+0xc>
1a281b8: e59f1004 ldr r1, [pc, #4] ; 1a281c4 <test_so()+0x10>
1a281bc: e59f2004 ldr r2, [pc, #4] ; 1a281c8 <test_so()+0x14>
1a281c0: 01111111 tsteq r1, r1, lsl r1
1a281c4: 02222222 eoreq r2, r2, #536870914 ; 0x20000002
1a281c8: 03333333 teqeq r3, #-872415232 ; 0xcc000000
1a281cc: e0803001 add r3, r0, r1
1a281d0: e3a03000 mov r3, #0, 0
1a281d4: e5933000 ldr r3, [r3]
As you can see, objdump shows the values in the literal pool as instructions.
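The `[pc, #4]` operands in the listing can be checked by hand. The following C sketch, assuming ARM state (where a read of pc yields the instruction address plus 8), recomputes the literal address from an encoded `ldr`. The function names `is_ldr_literal` and `ldr_literal_target` are hypothetical, chosen only for this example.

```c
#include <stdint.h>

/* True if the word is an A32 "LDR Rt, [pc, #+/-imm12]" (word load, offset
   addressing, base register pc). The condition bits and the U bit are ignored. */
static int is_ldr_literal(uint32_t insn)
{
    return (insn & 0x0F7F0000u) == 0x051F0000u;
}

/* Address the load reads from. In ARM state pc reads as the address of the
   current instruction plus 8; bit 23 (U) selects add or subtract. */
static uint32_t ldr_literal_target(uint32_t insn_addr, uint32_t insn)
{
    uint32_t pc    = insn_addr + 8u;
    uint32_t imm12 = insn & 0xFFFu;

    return (insn & (1u << 23)) ? pc + imm12 : pc - imm12;
}
```

For the first instruction, `ldr_literal_target(0x01a281b4, 0xe59f0004)` gives 0x01a281c0, the address objdump prints in its comment. The same arithmetic applied to the question's dump maps the word 0xe59fa004 at 0x60010120 to 0x6001012c, which is where 0x01111111 sits.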
The result of running this code, with the intentional crash, is:
test start.
...
Type: Data Abort
...
---
r0: 0x1111111
r1: 0x2222222
r2: 0x3333333
r3: 0x0
...
r13(sp): 0x149b8
r14(lr): 0x1a28200
r15(pc): 0x1a281d4 <-- address of Instruction causing a crash
---
So the CPU executed the instructions in question (which are really data in the literal pool) and crashed as planned at
1a281d4: e5933000 ldr r3, [r3]
(a null dereference, since r3 is zero).
For extra fun, let's trigger an undefined instruction abort with the following code:
asm volatile
(
"ldr r0, =0x1111111 \n\t"
"ldr r1, =0x2222222 \n\t"
"ldr r2, =0x3333333 \n\t"
".ltorg \n\t"
"add r3, r0, r1 \n\t"
// crash it
"udf #1 \n\t" // undefined instruction
:::"memory", "r0", "r1", "r2", "r3"
);
The disassembly is much the same, except that the null dereference is replaced by the undefined instruction udf:
01a281b4 <test_so()>:
1a281b4: e59f0004 ldr r0, [pc, #4] ; 1a281c0 <test_so()+0xc>
1a281b8: e59f1004 ldr r1, [pc, #4] ; 1a281c4 <test_so()+0x10>
1a281bc: e59f2004 ldr r2, [pc, #4] ; 1a281c8 <test_so()+0x14>
1a281c0: 01111111 tsteq r1, r1, lsl r1
1a281c4: 02222222 eoreq r2, r2, #536870914 ; 0x20000002
1a281c8: 03333333 teqeq r3, #-872415232 ; 0xcc000000
1a281cc: e0803001 add r3, r0, r1
1a281d0: e7f000f1 udf #1
Running this code crashes like this:
test start.
...
Type: Undefined Instruction Abort
...
---
r0: 0x1111111
r1: 0x2222222
r2: 0x3333333
r3: 0x3333333
...
r13(sp): 0x149b8
r14(lr): 0x1a281fc
r15(pc): 0x1a281d0 <-- address of Instruction causing a crash
---
So in this case it is a genuine undefined instruction abort, caused by
1a281d0: e7f000f1 udf #1
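PS: It seems that my first assumption about a buggy emulator was wrong after all. A disassembly that honors the ELF mapping symbols ($d marks data, $a marks ARM code) shows the literal pool as .word data: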
test_so():
1a281b4: 04 00 9f e5 ldr r0, [pc, #4]
1a281b8: 04 10 9f e5 ldr r1, [pc, #4]
1a281bc: 04 20 9f e5 ldr r2, [pc, #4]
$d.4:
1a281c0: 11 11 11 01 .word 0x01111111
1a281c4: 22 22 22 02 .word 0x02222222
1a281c8: 33 33 33 03 .word 0x03333333
$a.5:
1a281cc: 01 30 80 e0 add r3, r0, r1
1a281d0: f1 00 f0 e7 udf #1
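This last listing prints each word byte by byte in memory order, whereas the earlier objdump output prints it as a single 32-bit value. Assuming a little-endian target, as the byte order in the listing indicates, the two views are related as in this short C sketch (the function name `le_word` is invented for the example):

```c
#include <stdint.h>

/* Assemble a 32-bit word from four bytes given in memory order on a
   little-endian target: b0 sits at the lowest address. */
static uint32_t le_word(uint8_t b0, uint8_t b1, uint8_t b2, uint8_t b3)
{
    return (uint32_t)b0
         | ((uint32_t)b1 << 8)
         | ((uint32_t)b2 << 16)
         | ((uint32_t)b3 << 24);
}
```

With it, `le_word(0x04, 0x00, 0x9f, 0xe5)` gives 0xe59f0004, the first `ldr`, and `le_word(0x11, 0x11, 0x11, 0x01)` gives 0x01111111, the first pool entry.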