I am trying to figure out the encoding for unconditional jumps on SPARC, i.e. the JMP instruction, after disassembling a few binaries.
In my IDA disassembly the encoding for JMP %g1 is:
81 c0 40 00
And the encoding for jmp %g4 is:
81 c1 00 00
Digging through the SPARC manuals, I can't find any record of how this is encoded. I am also confused as to why IDA shows "JMP" rather than the "JMPL" in the docs.
The JMPL encoding formats given in the SPARC V9 manual are a little arcane to me, and I struggle to see what they are getting at:
10-RD-OP3-RS1-i-[-]-RS2
or
10-RD-OP3-RS1-i-SIMM13
"If either of the low-order two bits of the jump address is nonzero, a mem_address_not_aligned exception occurs"
Well, I'm not sure how that squares with the instruction that IDA found. Can someone break down how this maps to JMP %g1? How would this change for JMP %g2?
Note: This is a repost from Reverse Engineering Stack Exchange; I'm going to delete whichever one gets a good answer first. I've had better luck with this kind of question on SO lately.
jmp is an alias for jmpl with a destination register of %g0, i.e. the address that jmpl would otherwise save is discarded. The manual specifies that OP3 is fixed at 11 1000. The i bit selects between the two encoding variants. The single register operand can be encoded either way; your example uses i=0, meaning it's the jmpl %rs1+%rs2, %g0 form. We can now easily produce the machine code:
10 (fixed)
00000 (rd=%g0)
11 1000 (OP3, fixed)
00001 (rs1=%g1)
0 (i)
00000000 (ignored)
00000 (rs2=%g0)
Concatenating all of these gives 1000 0001 1100 0000 0100 0000 0000 0000 = 81 C0 40 00
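If you want to check the concatenation mechanically, here is a minimal Python sketch that packs the same fields into a 32-bit word. The names encode_jmpl_reg and as_hex_bytes are my own, not from any SPARC tool, and the sketch only covers the i=0 form shown above.

```python
OP3_JMPL = 0b111000  # op3 value for JMPL, as given in the manual


def encode_jmpl_reg(rd, rs1, rs2):
    """Pack the i=0 (rs1 + rs2) form of JMPL into a 32-bit word.

    Hypothetical helper for illustration; registers are plain numbers,
    so %g0..%g7 are 0..7.
    """
    for reg in (rd, rs1, rs2):
        if not 0 <= reg <= 31:
            raise ValueError("register number must be in 0..31")
    return (
        (0b10 << 30)        # op, fixed
        | (rd << 25)        # destination register (%g0 for plain jmp)
        | (OP3_JMPL << 19)  # op3, fixed
        | (rs1 << 14)       # first source register
        | (0 << 13)         # i = 0 selects the register form
        | rs2               # second source register; bits 12..5 stay zero
    )


def as_hex_bytes(word):
    """Format a 32-bit word as big-endian hex bytes, the way IDA shows it."""
    return " ".join(f"{b:02X}" for b in word.to_bytes(4, "big"))


print(as_hex_bytes(encode_jmpl_reg(0, 1, 0)))  # jmp %g1
print(as_hex_bytes(encode_jmpl_reg(0, 4, 0)))  # jmp %g4
```

The two calls correspond to the jmp %g1 and jmp %g4 instructions from the question, so the printed bytes can be compared directly against the IDA listing.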
You could also encode jmp %g1 as jmp %g1+0, which would use the other variant, with i=1, and produces 81 C0 60 00. Another possibility is jmp %g0+%g1, which gives 81 C0 00 01.
To get jmp %g2, you only change the rs1 field, to 00010, which gives 81 C0 80 00.
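Going the other way can help when reading disassembly. The following is a rough Python sketch that splits a word back into the fields described above; decode_format3 is a hypothetical name of mine, and it assumes the word really is in this register/immediate layout without checking op or op3.

```python
def decode_format3(word):
    """Split a 32-bit SPARC word into the fields used by JMPL.

    Illustrative helper only: it does not verify that op == 10 or that
    op3 == 11 1000, it just extracts the bit fields.
    """
    fields = {
        "op": (word >> 30) & 0x3,
        "rd": (word >> 25) & 0x1F,
        "op3": (word >> 19) & 0x3F,
        "rs1": (word >> 14) & 0x1F,
        "i": (word >> 13) & 0x1,
    }
    if fields["i"]:
        # i=1: the low 13 bits are a sign-extended immediate
        simm13 = word & 0x1FFF
        if simm13 & 0x1000:
            simm13 -= 0x2000
        fields["simm13"] = simm13
    else:
        # i=0: the low 5 bits are rs2, bits 12..5 are unused
        fields["rs2"] = word & 0x1F
    return fields


for raw in ("81 C0 40 00", "81 C0 60 00", "81 C0 00 01", "81 C1 00 00"):
    word = int.from_bytes(bytes.fromhex(raw), "big")
    print(raw, decode_format3(word))
```

Running it over the byte sequences discussed here shows which of rs1, rs2 and i differ between the variants.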