
How can I tell what architecture a microprocessor is from the assembled code?


I have a device I am trying to reverse-engineer. I want to avoid opening the physical device, so I sniffed the packets it received during a firmware update. However, I have no clue which architecture this microprocessor uses.

Here's a sample:

df f8 0c d0 01 f0 06 f8 20 48 20 47 31 15 06 20 f8 f5 03 20 09 4b 1e f0 04 0f 1c bf ef f3 09 80 18 47 ef f3 08 80 06 49 06 4a 88 42 01 d8 90 42 02 d8 8d 46 4f f0 20 20 18 47 20 20 51 77 03 20 f8 f5 03 20 70 e2 03 20 30 b4 50 e8 01 2f 93 b2 c4 89 23 44 84 89 a3 42 28 bf 1b 1b 04 89 a3 42 02 bf bf f3 2f 8f 20 20 07 e0 c3 ea 02 03 40 e8 01 34 20 2c e9 d1 4f f0 01 20 0a 60 30 bc 70 47 50 e8 01 2f c2 ea 02 42 40 e8 01 21 20 29 f7 d1 70 47 30 b4 50 e8 02 2f 1f fa a2 f3 c4 88 a3 42 02 bf bf f3 2f 8f 20 20 0d e0 c4 89 23 44 84 89 a3 42 28 bf 1b 1b c2 ea 03 43 40 e8 02 34 20 2c e8 d1 4f f0 01 20 0a 60 30 bc 70 47 50 e8 02 2f c2 ea 22 42 40 e8 02 21 20 29 f7 d1 70 47 03 46 53 e8 02 2f d9 88 b2 eb 32 4f 4f f0 20 20 17 bf 92 b2 41 ea 01 41 42 ea 01 41 04 e0 5a 68 b2 eb 32 4f 08 bf 01 20 43 e8 02 12 20 2a e8 d1 70 47 30 b4 04 46 54 e8 20 0f 0d 46 44 e8 20 53 20 2b f8 d1 15 60 30 bc 70 47 30 b4 04 46 54 e8 20 0f 40 ea 01 05 44 e8 20 53 20 2b f7 d1 15 60 30 bc

I have an online disassembler and am currently trying each architecture in its long list (there are a lot). I'm hoping there's an easier way to do this, or that there are skilled ASM-ers out there who can recognize the patterns by heart.

Near the end of the long hex dump, there are some references to C code, so whatever it is, there is a compiler out there that supports it.

Also, if this is the wrong community, please let me know and I will move it.


Solution

  • It's ARM Thumb code (Thumb-2, given the 32-bit encodings such as ldr.w and mrs):

       0:   f8df d00c       ldr.w   sp, [pc, #12]   ; 0x10
       4:   f001 f806       bl      0x1014
       8:   4820            ldr     r0, [pc, #128]  ; (0x8c)
       a:   4720            bx      r4
       c:   1531            asrs    r1, r6, #20
       e:   2006            movs    r0, #6
      10:   f5f8 2003                       ; <UNDEFINED> instruction: 0xf5f82003
      14:   4b09            ldr     r3, [pc, #36]   ; (0x3c)
      16:   f01e 0f04       tst.w   lr, #4
      1a:   bf1c            itt     ne
      1c:   f3ef 8009       mrsne   r0, PSP
      20:   4718            bxne    r3
      22:   f3ef 8008       mrs     r0, MSP
      26:   4906            ldr     r1, [pc, #24]   ; (0x40)
      28:   4a06            ldr     r2, [pc, #24]   ; (0x44)
      2a:   4288            cmp     r0, r1
      2c:   d801            bhi.n   0x32
      2e:   4290            cmp     r0, r2
      30:   d802            bhi.n   0x38
      32:   468d            mov     sp, r1
      34:   f04f 2020       mov.w   r0, #536879104  ; 0x20002000
      38:   4718            bx      r3
      3a:   2020            movs    r0, #32
      3c:   7751            strb    r1, [r2, #29]
    ...
    

    The "undefined instruction" at 0x10 is not code. It is constant pool data (the little-endian word 0x2003f5f8) that the instruction at address 0 loads into sp.
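
    The address arithmetic behind that load can be checked by hand. The following is a minimal Python sketch, not part of the original answer; the helper name literal_address is made up for illustration. It assumes the usual Thumb rule that PC reads as the instruction address plus 4 and is word-aligned before the offset is added.

```python
import struct

# First 20 bytes of the sample from the question.
code = bytes.fromhex("df f8 0c d0 01 f0 06 f8 20 48 20 47 31 15 06 20 f8 f5 03 20")


def literal_address(insn_addr: int, offset: int) -> int:
    """Address read by a Thumb `ldr Rt, [pc, #offset]` located at insn_addr.

    Hypothetical helper for illustration only.
    """
    pc = insn_addr + 4         # in Thumb state, PC reads as instruction address + 4
    return (pc & ~3) + offset  # the base is word-aligned before adding the offset


addr = literal_address(0x0, 12)                  # ldr.w sp, [pc, #12] at address 0
(value,) = struct.unpack_from("<I", code, addr)  # little-endian 32-bit word
print(hex(addr), hex(value))
```

    The loaded word falls in the 0x20000000 region, which is where Cortex-M parts map SRAM, so it is plausibly an initial stack pointer. That reading is an inference from the value, not something the dump states.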

    I used the following command to disassemble it:

    arm-linux-gnueabi-objdump -D -b binary -M force-thumb -m arm_any test.bin 
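
    objdump needs a raw binary file, so the hex dump has to be converted first. One way to do that, sketched in Python; the function name hex_dump_to_file is an assumption for illustration, and test.bin simply matches the file name used in the command above.

```python
from pathlib import Path


def hex_dump_to_file(hex_text: str, out_path: str) -> int:
    """Write space-separated hex bytes to out_path as raw binary.

    Hypothetical helper; returns the number of bytes written.
    """
    data = bytes.fromhex(hex_text)  # whitespace between byte pairs is ignored
    Path(out_path).write_bytes(data)
    return len(data)


if __name__ == "__main__":
    # Only the first 12 bytes are shown here; paste the full dump instead.
    sample = "df f8 0c d0 01 f0 06 f8 20 48 20 47"
    print(hex_dump_to_file(sample, "test.bin"), "bytes written")
```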
    

    ARM Cortex-M CPUs, which execute only the Thumb instruction set, are very popular in devices that need a fair bit of computing power but also need to be very energy efficient. This makes Thumb a good place to start on a device like this.
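
    Before reaching for a disassembler, a quick way to test the Thumb guess is to count a few very common 16-bit Thumb encodings at halfword-aligned offsets. The sketch below is a rough heuristic of my own, not an established tool; count_thumb_markers is a hypothetical name and the marker list is deliberately tiny. A match can also be the second half of a 32-bit instruction or plain data, so treat the counts as a hint only.

```python
def count_thumb_markers(data: bytes) -> dict:
    """Count a few common little-endian Thumb halfwords at even offsets.

    Hypothetical heuristic for illustration; many hits suggest Thumb code.
    """
    markers = {
        "bx lr": bytes.fromhex("7047"),          # 0x4770, the usual function return
        "push {r4, r5}": bytes.fromhex("30b4"),  # 0xb430
        "pop {r4, r5}": bytes.fromhex("30bc"),   # 0xbc30
    }
    counts = dict.fromkeys(markers, 0)
    # Thumb instructions are halfword-aligned, so step two bytes at a time.
    for off in range(0, len(data) - 1, 2):
        halfword = data[off:off + 2]
        for name, pattern in markers.items():
            if halfword == pattern:
                counts[name] += 1
    return counts


# A contiguous excerpt of the dump from the question.
excerpt = bytes.fromhex(
    "4f f0 01 20 0a 60 30 bc 70 47 50 e8 01 2f c2 ea "
    "02 42 40 e8 01 21 20 29 f7 d1 70 47 30 b4"
)
print(count_thumb_markers(excerpt))
```

    In the sample, the repeated "70 47" (bx lr) at the end of each short routine, paired with "30 b4" and "30 bc", is the kind of pattern that gives Thumb away at a glance.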