
What is the encoding format for unconditional Jumps on SPARC/SPARC64?


I am trying to figure out the encoding for unconditional jumps on SPARC, i.e. the JMP instruction, after disassembling a few binaries.

In my IDA disassembly the encoding for JMP %g1 is:

81 c0 40 00 

And the encoding for jmp %g4 is:

81 c1 00 00

Digging through the SPARC manuals, I can't find a record of how this is encoded. I am also confused as to why IDA refers to a "JMP" as opposed to the "JMPL" in the docs.

The JMPL encoding formats given in the SPARC V9 manual are a little arcane to me, and I struggle with what they are getting at:

10-RD-OP3-RS1-i-[-]-rs2 

or

10-RD-OP3-RS1-i-simm13

"If either of the low-order two bits of the jump address is nonzero, a mem_address_not_aligned exception occurs"

Well, I'm not sure how that squares with the instruction that IDA found. Can someone break down how this maps to JMP %g1? How would this change for JMP %g2?

Note: This is a repost from Reverse Engineering Stack Exchange. I'm going to delete whichever one gets a good answer first. I've had better luck with this kind of question on SO lately.


Solution

  • jmp is an alias for jmpl with a destination register of %g0, i.e. the address that jmpl would save in rd is discarded. The manual specifies that OP3 is fixed at 11 1000. The i bit selects between the two encoding variants. A single register operand can be encoded either way; your example uses i=0, meaning it's the jmpl %rs1+%rs2, %g0 form with rs2=%g0. We can now easily produce the machine code:

    10 (fixed)
    00000 (rd=%g0)
    11 1000 (OP3, fixed)
    00001 (rs1=%g1)
    0 (i)
    00000000 (ignored)
    00000 (rs2=%g0)
    

    Concatenating all of these gives 1000 0001 1100 0000 0100 0000 0000 0000 = 81 C0 40 00

    You could also encode jmp %g1 as jmp %g1+0 which would use the other variant, with i=1, and produces 81 C0 60 00. Another possibility is jmp %g0+%g1 which gives 81 C0 00 01.

    To get jmp %g2, change the rs1 field to 00010 (rs1=%g2), which gives 81 C0 80 00.
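
The field layout above can also be checked mechanically. The following is a minimal sketch in Python, not taken from any manual or tool: the helper names `encode_jmpl` and `hex_bytes` are hypothetical, and it assumes registers are passed as plain numbers (%g0..%g7 are 0..7) and that the unused bits 12..5 of the i=0 form are written as zero.

```python
def encode_jmpl(rd, rs1, rs2=0, simm13=None):
    """Build the 32-bit JMPL instruction word (hypothetical helper).

    rd, rs1, rs2 are register numbers 0..31 (%g0..%g7 are 0..7).
    If simm13 is None the i=0 (rs1 + rs2) form is used,
    otherwise the i=1 (rs1 + simm13) form.
    """
    OP = 0b10        # bits 31..30, fixed
    OP3 = 0b111000   # bits 24..19, fixed for JMPL
    word = (OP << 30) | ((rd & 0x1F) << 25) | (OP3 << 19) | ((rs1 & 0x1F) << 14)
    if simm13 is None:
        # i = 0 (bit 13 clear); bits 12..5 left as zero; rs2 in bits 4..0
        word |= rs2 & 0x1F
    else:
        # i = 1 (bit 13 set); 13-bit two's-complement immediate in bits 12..0
        word |= (1 << 13) | (simm13 & 0x1FFF)
    return word


def hex_bytes(word):
    """Format the word as big-endian bytes, the order SPARC instructions are stored in."""
    return " ".join(f"{b:02X}" for b in word.to_bytes(4, "big"))


if __name__ == "__main__":
    print(hex_bytes(encode_jmpl(rd=0, rs1=1)))            # jmp %g1
    print(hex_bytes(encode_jmpl(rd=0, rs1=4)))            # jmp %g4
    print(hex_bytes(encode_jmpl(rd=0, rs1=1, simm13=0)))  # jmp %g1+0
    print(hex_bytes(encode_jmpl(rd=0, rs1=0, rs2=1)))     # jmp %g0+%g1
```

With rd=0 this reproduces the byte sequences quoted in the question and in this answer; changing only rs1 moves between jmp %g1, jmp %g2 and jmp %g4.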