Tags: assembly, arm, cortex-m, thumb

How to interpret the assembly boot code with ".word"


I'm studying assembly boot code step by step.

I found the assembly boot code below, but I still don't completely understand it.

As far as I understand it:

First, line 1 executes,

then line 2,

then line 5,

then line 6, which loads 0x40002001 into the r1 register,

then line 7, which branches to the address in r1 (the PC is updated to 0x40002001).

That much I understand.

But I can't understand the purpose and meaning of lines 3 and 4.

Could you please explain what lines 3 and 4 mean and what they are for?

1    .section .text                                                             
2        .globl main

3        .word   0x40002000 
4        .word   main+1

5     main:                                                                      
6         ldr   r1, st0                                                       
7         bx    r1           

8        .align 4
9     st0:                                                                   
10       .word   0x40002001 

Solution

  • You need to read the documentation for the Cortex-M3. It comes from ARM's website, not necessarily from the chip vendor's.

    The short answer is that the .word directives describe the vector table; they are not instructions. The .align, whose argument's meaning can vary between targets (in the GNU assembler for ARM, .align 4 means a 2^4 = 16-byte boundary, not a 4-byte one), is there to make sure the constant is word-aligned so that the load does not fault.

    The first word, at address 0x00000000, is the value loaded into the stack pointer on reset. The second word is the reset vector. Because this is a Thumb-only machine, the address in the vector table must have its lsbit set, which is why the original code writes main+1.
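To make the first two words concrete, here is a minimal host-side Python sketch (not code for the target) that decodes them the way the paragraph above describes. The name decode_vector_table and the 8-byte sample image are my own, for illustration; the bytes are the little-endian form of the two .word values once main ends up at address 0x8, as in the GNU assembler listings in this answer.

```python
import struct

def decode_vector_table(image: bytes):
    """Return (initial_sp, reset_vector, entry_address, thumb_bit) from the first 8 bytes."""
    # Two little-endian 32-bit words: stack pointer value, then reset vector.
    initial_sp, reset_vector = struct.unpack_from("<II", image, 0)
    thumb_bit = reset_vector & 1       # lsbit marks Thumb state
    entry_address = reset_vector & ~1  # address of the first instruction
    return initial_sp, reset_vector, entry_address, thumb_bit

# Hypothetical sample: .word 0x40002000 followed by .word main (main at 0x8, lsbit set)
image = bytes.fromhex("0020004009000000")
sp, vector, entry, thumb = decode_vector_table(image)
print(hex(sp), hex(vector), hex(entry), thumb)  # 0x40002000 0x9 0x8 1
```

The vector holds 0x9, but execution starts at 0x8; the extra 1 only carries the Thumb bit.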

    The code then branches to 0x40002000. The lsbit is stripped off, but it must be set for the bx to work (the PC gets 0x40002000, not 0x40002001).
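The effect of the lsbit on bx can be sketched on the host in Python. This is only a model of the behavior described above, and the name bx_model is hypothetical. My understanding is that on a Thumb-only core such as the Cortex-M3 a bx to an address with the lsbit clear ends in a fault, so the sketch simply reports the state bit alongside the new PC.

```python
def bx_model(target: int):
    """Model of bx: bit 0 selects Thumb state, the remaining bits become the PC."""
    thumb_state = bool(target & 1)
    pc = target & 0xFFFFFFFE  # lsbit is stripped before it reaches the PC
    return pc, thumb_state

pc, thumb_state = bx_model(0x40002001)
print(hex(pc), thumb_state)  # 0x40002000 True

# With the lsbit clear the PC is the same, but the core is asked to leave Thumb state.
print(bx_model(0x40002000))  # (1073750016, False)
```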

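    You didn't specify the assembler. If you are going to use the GNU assembler, you can clean it up a bit. With .thumb_func marking main as a Thumb function, .word main gets the lsbit set for you, so the manual +1 is not needed: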

    .cpu cortex-m3
    .thumb
    
        .globl _start
    _start:
        .word   0x40002000
        .word   main
    
        .thumb_func
    main:
         ldr   r1, st0
         bx    r1
    
        .align
    st0:
        .word   0x40002001
    

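    This produces the following (the disassembler decodes the data words as if they were instructions, so the andmi/andeq mnemonics are meaningless):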

    Disassembly of section .text:
    
    00000000 <_start>:
       0:   40002000    andmi   r2, r0, r0
       4:   00000009    andeq   r0, r0, r9
    
    00000008 <main>:
       8:   4900        ldr r1, [pc, #0]    ; (c <st0>)
       a:   4708        bx  r1
    
    0000000c <st0>:
       c:   40002001    andmi   r2, r0, r1
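If you want to check the disassembler's (c <st0>) by hand, the 16-bit ldr literal encoding can be decoded as below. This is a host-side Python sketch, assuming the standard Thumb encoding (top five bits 01001, then a 3-bit register number, then an 8-bit offset counted in words) and that the PC reads as the instruction address plus 4, rounded down to a multiple of 4. The name decode_ldr_literal is made up for this example.

```python
def decode_ldr_literal(halfword: int, address: int):
    """Return (register, literal_address) for a 16-bit Thumb ldr rX, [pc, #imm]."""
    if (halfword >> 11) != 0b01001:
        raise ValueError("not a 16-bit ldr (literal) encoding")
    register = (halfword >> 8) & 0x7
    imm8 = halfword & 0xFF
    pc = address + 4                         # PC reads 4 bytes ahead in Thumb state
    literal_address = (pc & ~3) + imm8 * 4   # base is the PC rounded down to 4
    return register, literal_address

# 4900 at address 0x8, as in the listing above
register, literal_address = decode_ldr_literal(0x4900, 0x8)
print(register, hex(literal_address))  # 1 0xc
```

Because the base is rounded down to a multiple of 4 and the offset counts whole words, this encoding can only reach word-aligned literals, which is why the alignment of st0 matters.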
    

    Now the question is: how are you getting the program into memory at 0x40002000 before reset?

    You can use this trick in the GNU assembler to avoid the .align confusion and the possible wasted space (from using the wrong value after .align):

    .cpu cortex-m3
    .thumb
    
        .globl _start
    _start:
        .word   0x40002000
        .word   main
    
        .thumb_func
    main:
         ldr   r1, =0x40002001
         bx    r1
    
    Disassembly of section .text:
    
    00000000 <_start>:
       0:   40002000    andmi   r2, r0, r0
       4:   00000009    andeq   r0, r0, r9
    
    00000008 <main>:
       8:   4900        ldr r1, [pc, #0]    ; (c <main+0x4>)
       a:   4708        bx  r1
       c:   40002001    andmi   r2, r0, r1
    

    The alignment is there to keep something like this from causing a fault:

    .cpu cortex-m3
    .thumb
    
        .globl _start
    _start:
        .word   0x40002000
        .word   main
    
        .thumb_func
    main:
         nop
         ldr   r1, st0
         bx    r1
    
    .align
    st0: .word 0x40002001
    

    Here the .align caused padding to be added; in the earlier example it didn't need to, because the value already landed on an aligned address.

    Disassembly of section .text:
    
    00000000 <_start>:
       0:   40002000    andmi   r2, r0, r0
       4:   00000009    andeq   r0, r0, r9
    
    00000008 <main>:
       8:   46c0        nop         ; (mov r8, r8)
       a:   4901        ldr r1, [pc, #4]    ; (10 <st0>)
       c:   4708        bx  r1
       e:   bf00        nop
    
    00000010 <st0>:
      10:   40002001    andmi   r2, r0, r1
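The padding arithmetic is easy to check on the host. The Python sketch below is only an illustration with hypothetical function names; the second function encodes my assumption that the GNU assembler on ARM targets reads .align N as a 2**N-byte boundary, and the last line also assumes the section itself starts on a 16-byte boundary.

```python
def align_padding(address: int, boundary: int = 4) -> int:
    """Bytes of padding needed to move address up to the next multiple of boundary."""
    return (-address) % boundary

def gas_arm_align_boundary(exponent: int) -> int:
    """Assumption: on ARM targets GNU as treats .align N as a 2**N-byte boundary."""
    return 1 << exponent

print(align_padding(0xC))  # 0: without the nop, the data after bx already sits at 0xc
print(align_padding(0xE))  # 2: with the nop, two bytes of padding move st0 to 0x10
print(gas_arm_align_boundary(4))                       # 16
print(align_padding(0xC, gas_arm_align_boundary(4)))   # 4: wasted bytes with .align 4
```

The last line is the possible waste: under that assumption, .align 4 in the original code pads st0 out to 0x10 even though 0xc was already word-aligned.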
    

    This version also aligns for you, padding with zeros in this case. It doesn't use .align; the assembler places the data in a literal pool.

    .cpu cortex-m3
    .thumb
    
        .globl _start
    _start:
        .word   0x40002000
        .word   main
    
        .thumb_func
    main:
         nop
         ldr   r1, =0x40002001
         bx    r1
    

    It doesn't disassemble well, so the Intel HEX output shows what happened.

    :100000000020004009000000C046014908470000E8
    :04001000012000408B
    :00000001FF
    

    Same as above, but the padding is 0x0000 instead of a nop:

    0x4708 (bx r1)
    0x0000
    0x40002001
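To read the Intel HEX records above without doing the byte swapping by hand, a small host-side Python sketch like this one can unpack them. It is a minimal parser written for this answer (parse_ihex is a hypothetical helper, not a library function): it checks each record's checksum, handles only data records (type 00), and ignores everything else, including extended address records.

```python
import struct

def parse_ihex(lines):
    """Return {address: byte} for the data records, verifying each checksum."""
    memory = {}
    for line in lines:
        raw = bytes.fromhex(line.strip()[1:])  # drop the leading ':'
        if sum(raw) & 0xFF != 0:               # all bytes including checksum sum to 0 mod 256
            raise ValueError(f"bad checksum in {line!r}")
        count = raw[0]
        address = int.from_bytes(raw[1:3], "big")
        record_type = raw[3]
        if record_type == 0x00:                # data record
            for offset, value in enumerate(raw[4:4 + count]):
                memory[address + offset] = value
    return memory

records = [
    ":100000000020004009000000C046014908470000E8",
    ":04001000012000408B",
    ":00000001FF",
]
memory = parse_ihex(records)
image = bytes(memory[a] for a in sorted(memory))

words = struct.unpack_from("<II", image, 0)      # vector table: stack pointer, reset vector
halfwords = struct.unpack_from("<4H", image, 8)  # nop, ldr, bx, padding
literal = struct.unpack_from("<I", image, 0x10)[0]

print([hex(w) for w in words])      # ['0x40002000', '0x9']
print([hex(h) for h in halfwords])  # ['0x46c0', '0x4901', '0x4708', '0x0']
print(hex(literal))                 # 0x40002001
```

The fourth halfword is the 0x0000 padding, and the literal lands at 0x10, the same place the .align version put st0.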