
By reading .hex and .map, how can I be sure that a BL links to the right function offset?


I'm currently doing a "hex compare" for fun, to understand what is happening.

I know that comparing hex files sometimes gives too many changes to be comparable.

By just changing a function call, I can make a small change happen in the hex. My embedded code contains Foo(5);, which I replaced with Bar(5); (the signature of which is identical), then with Bla(5);.

When I compare the hex files I have the following:

[image: hex file comparison, not reproduced]

The part in green is the CRC.

With the help of the hex file and the map file, how can I be sure that Foo has indeed been replaced by Bar or Bla, and not by another function?

[image not reproduced]

Here's what I found in the ARMv7-M architecture doc (link). But even after knowing the offset, I still don't know if I can figure something out of this... how do I translate the .map addresses into machine code?

In the .map, respective addresses are:

[image: .map file addresses, not reproduced]

I'm working on an STM32L4xx (Cortex-M4) with the IAR compiler.


Solution

  • As always, you need the ARM Architecture Reference Manual (as well as the Technical Reference Manual) for that core. The Cortex-M4 TRM will tell you that it is an ARMv7-M architecture, and you get the ARM ARM for that.

    BL is two separate 16-bit instructions, even though it is often shown as if it were a single 32-bit one. They don't have to execute back to back, but they have to execute in the right order (and you can't mess with lr in between):

    11110xxxxxxxxxxx
    11111xxxxxxxxxxx
    

    so

    0xF0E6
    
    00011100110000000000000
    
    0xFC03
    
    00011100110000000000000
               100000000110
    00011100110100000000110
    00011100110100000000110
    
    000 1110 0110 1000 0000 0110
    0x0E6806+4+PC
    

    Or write a program (this one doesn't handle the sign extension on the H=10 half, because none of your values have the sign bit set):

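    #include <stdio.h>
    /* a = first halfword (upper offset bits), b = second halfword (lower offset bits) */
    void fun ( unsigned int a, unsigned int b )
    {
        unsigned int ra;
        ra=a&0x7FF;          /* low 11 bits of the first halfword */
        ra<<=12;             /* they become offset bits 22:12 */
        ra|=(b&0x7FF)<<1;    /* low 11 bits of the second halfword, times 2 */
        ra&=0xFFFFFFFE;      /* offset is halfword aligned */
        ra+=4;               /* PC reads as instruction address + 4 */
        printf("PC(inst)+0x%08X\n",ra);
    }
    int main ( void )
    {
        fun(0xF0E6,0xFC03);
        fun(0xF008,0xFD03);
        fun(0xF0C7,0xFC03);
        return(0);
    }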
    

    gives:

    PC(inst)+0x000E680A
    PC(inst)+0x00008A0A
    PC(inst)+0x000C780A
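
    If you also want backward branches handled, here is a minimal sketch of the same calculation with the sign extension, following the BL encoding T1 in the ARMv7-M ARM (fields S, imm10, J1, J2, imm11). The names bl_is_t1, bl_offset and bl_target are made up for this example, and the unsigned-to-signed cast assumes a two's-complement compiler.

    ```c
    #include <stdint.h>

    /* Returns 1 if the halfword pair matches the BL (encoding T1) pattern:
       hw1 = 11110 S imm10, hw2 = 11 J1 1 J2 imm11. */
    int bl_is_t1(uint16_t hw1, uint16_t hw2)
    {
        return (hw1 & 0xF800u) == 0xF000u && (hw2 & 0xD000u) == 0xD000u;
    }

    /* Signed byte offset encoded in the pair (relative to instruction address + 4). */
    int32_t bl_offset(uint16_t hw1, uint16_t hw2)
    {
        uint32_t s     = (hw1 >> 10) & 1u;
        uint32_t imm10 = hw1 & 0x3FFu;
        uint32_t j1    = (hw2 >> 13) & 1u;
        uint32_t j2    = (hw2 >> 11) & 1u;
        uint32_t imm11 = hw2 & 0x7FFu;
        uint32_t i1    = (~(j1 ^ s)) & 1u;   /* I1 = NOT(J1 EOR S) */
        uint32_t i2    = (~(j2 ^ s)) & 1u;   /* I2 = NOT(J2 EOR S) */
        uint32_t imm   = (s << 24) | (i1 << 23) | (i2 << 22)
                       | (imm10 << 12) | (imm11 << 1);
        if (s)
            imm |= 0xFE000000u;              /* sign-extend from bit 24 */
        return (int32_t)imm;                 /* assumes two's complement */
    }

    /* Absolute branch target, given the address of the first halfword. */
    uint32_t bl_target(uint32_t insn_addr, uint16_t hw1, uint16_t hw2)
    {
        return insn_addr + 4u + (uint32_t)bl_offset(hw1, hw2);
    }
    ```

    For example, bl_target(0x0, 0xF0E6, 0xFC03) comes out as 0x000E680A, the same as above, and the pair 0xF7FF, 0xFFFE decodes to an offset of -4 (a branch to itself).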
    

    What is the address of the bl instruction in question? With the Intel hex excerpt you have provided, it cannot be determined. (I prefer srec, with its 32-bit addresses.)
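
    To recover that address from a complete Intel hex file, a rough sketch: each record has the form :LLAAAATT<data>CC, the AAAA field is only a 16-bit offset, and the upper 16 address bits come from the most recent type 04 (extended linear address) record. The function name ihex_data_address is made up for this example; it does not verify the checksum and it ignores type 02 (segment) records.

    ```c
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Parse one Intel HEX record line.
       *upper carries the upper address bits from the last type-04 record.
       Returns  1 for a data record (*addr = absolute address of its first
                  data byte, *count = number of data bytes),
                0 for any other record type (type 04 updates *upper),
               -1 if the line is malformed. */
    int ihex_data_address(const char *line, uint32_t *upper,
                          uint32_t *addr, unsigned *count)
    {
        unsigned len, off, type;

        if (strlen(line) < 11)               /* shortest record: ":LLAAAATTCC" */
            return -1;
        if (sscanf(line, ":%2x%4x%2x", &len, &off, &type) != 3)
            return -1;

        if (type == 0x04) {                  /* extended linear address */
            unsigned hi;
            if (len != 2 || sscanf(line + 9, "%4x", &hi) != 1)
                return -1;
            *upper = (uint32_t)hi << 16;
            return 0;
        }
        if (type != 0x00)                    /* EOF, start address, ... */
            return 0;

        *addr  = *upper + off;
        *count = len;
        return 1;
    }
    ```

    Fed ":020000040800F2" and then ":04012000E6F003FC06" (made-up records for illustration), it reports 4 data bytes at 0x08000120. Note that each halfword is stored little-endian, so 0xF0E6 0xFC03 shows up as E6 F0 03 FC in the file.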

    The oldest ARM ARM, the ARMv5 one, is better for seeing the two instructions, but there is a typo/bug in that document: for the Thumb version you don't strip off the two lower bits; that would be the ARM version of the instruction.

    .thumb
    bl here
    nop
    bl here
    nop
    nop
    nop
    here:
    
    
    
    00000000 <here-0x10>:
       0:   f000 f806   bl  10 <here>
       4:   46c0        nop         ; (mov r8, r8)
       6:   f000 f803   bl  10 <here>
       a:   46c0        nop         ; (mov r8, r8)
       c:   46c0        nop         ; (mov r8, r8)
       e:   46c0        nop         ; (mov r8, r8)
    
    
    PC(inst)+0x00000010
    PC(inst)+0x0000000A
    

    Note that with the GNU assembler you can't just use the .word or .hword trick; you have to use .inst.n:

    .thumb
    bl here
    nop
    bl here
    nop
    nop
    nop
    here:
    .inst.n 0xf000
    .inst.n 0xf806
    
    00000000 <here-0x10>:
       0:   f000 f806   bl  10 <here>
       4:   46c0        nop         ; (mov r8, r8)
       6:   f000 f803   bl  10 <here>
       a:   46c0        nop         ; (mov r8, r8)
       c:   46c0        nop         ; (mov r8, r8)
       e:   46c0        nop         ; (mov r8, r8)
    
    00000010 <here>:
      10:   f000 f806   bl  20 <here+0x10>
    

    With one of yours:

    .thumb
    .inst.n 0xF0E6
    .inst.n 0xFC03
    
    00000000 <.text>:
       0:   f0e6 fc03   bl  e680a <.text+0xe680a>
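
    Going the other way, which is what the question asks for (from a .map address to machine code), here is a sketch that builds the two halfwords from the address of the BL and the address of the function. The name bl_encode is made up for this example, and clearing bit 0 of the target is an assumption, in case the .map lists the function address with the Thumb bit set.

    ```c
    #include <stdint.h>

    /* Encode BL (encoding T1) located at insn_addr, branching to target.
       Returns 0 on success, -1 if the target is out of the +/-16 MB range. */
    int bl_encode(uint32_t insn_addr, uint32_t target,
                  uint16_t *hw1, uint16_t *hw2)
    {
        /* Offset is relative to instruction address + 4; drop the Thumb bit. */
        int32_t offset = (int32_t)((target & ~1u) - (insn_addr + 4u));
        if (offset < -16777216 || offset > 16777214)
            return -1;

        uint32_t u  = (uint32_t)offset;
        uint32_t s  = (u >> 24) & 1u;
        uint32_t i1 = (u >> 23) & 1u;
        uint32_t i2 = (u >> 22) & 1u;
        uint32_t j1 = (~(i1 ^ s)) & 1u;      /* J1 = NOT(I1 EOR S) */
        uint32_t j2 = (~(i2 ^ s)) & 1u;      /* J2 = NOT(I2 EOR S) */

        *hw1 = (uint16_t)(0xF000u | (s << 10) | ((u >> 12) & 0x3FFu));
        *hw2 = (uint16_t)(0xD000u | (j1 << 13) | (j2 << 11) | ((u >> 1) & 0x7FFu));
        return 0;
    }
    ```

    With insn_addr 0x0 and target 0xE680A this gives F0E6 FC03, matching the disassembly above. So: take the address of the BL from the hex file and the address of Bar from the .map, encode, and compare against the four bytes in the hex (remembering that each halfword is stored low byte first).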