
How to understand the operand encoding of Renesas M16C bit instructions?


I'm looking at a small binary targeting the Renesas M16C µC and reading the document "M16C/60, M16C/20, M16C/Tiny Series Software Manual" (https://www.renesas.com/ja/document/mah/m16c60-m16c20-m16ctiny-series-software-manual) to understand how its instructions are encoded. I'm having difficulty understanding section 2.5, which is supposed to describe how the operands of bit-oriented instructions like bset and bclr are encoded.

Focusing on bclr for a moment, it specifies that when the low bits of the instruction are 0000 the resulting operand is bit,R0. However, nowhere in the text or diagram does it specify where the bit part is coming from. Essentially, it looks something like this:

[0111'1110][1000'0000]

Luckily, the binary I have did have instances of bclr using this operand type, and I was able to deduce that the instruction is actually encoded as:

[0111'1110][1000'0000][0000' bit]

which isn't apparent from anywhere in the document that I could see.

Operand code 0110 is specified as encoding [a1], with no explicit bit number. I'm reading this as "take the value of a1, keep the low 3 bits, call that the bit. Take the remaining bits and shift them right 3 bits, that is the byte address, from which the bit should be extracted." Am I correct here?

Finally, the confusion is maximal when looking at the operand codes 1000 and 1010 respectively. The manual states base:8[A0] and bit,base:8[A0] respectively, but I don't have any sample binary to determine exactly how these encodings work. Is A0 treated as a bit address in these cases as well? Is the 8-bit displacement a bit offset, a byte offset off A0, or both?


Solution

  • The relevant chapter in the software manual is chapter 4.2; for your example instruction bclr, see pages 152 to 153. The quick reference on page 10 shows the page numbers.

    The instruction is encoded in 2 to 4 bytes, and there is an additional short encoding in 2 bytes:

    bclr:g = [0111'1110] [1000'dest] ([dsp8] | [dsp16])

    bclr:s = [0100'0bit] [dsp8] for addressing mode bit,base:11[SB]

    The encoding for bclr:s is simple, as the bit number and the 8 bit base address are clearly shown. The addressable range*6 is 256 bytes beginning with the address in SB.
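    As a minimal sketch of the short form, assuming the layout [0100'0bit] [dsp8] shown above and the reading that dsp8 is a byte offset from SB (the function names are hypothetical and only serve as illustration):

```python
def decode_bclr_s(code: bytes):
    """Split the 2-byte short form [0100'0bit][dsp8] into (bit, dsp8)."""
    opcode, dsp8 = code[0], code[1]
    # The upper five bits of the first byte must be 0100'0.
    if (opcode & 0xF8) != 0x40:
        raise ValueError("not a bclr:s encoding")
    bit = opcode & 0x07          # 3-bit bit number inside the opcode byte
    return bit, dsp8


def bclr_s_target(sb: int, bit: int, dsp8: int):
    """Return (byte_address, bit_number), assuming dsp8 is an unsigned byte offset from SB."""
    return sb + dsp8, bit
```

    For example, the bytes 0x43 0x10 would decode to bit 3 of the byte at SB + 0x10 under these assumptions.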

    You seem to have missed that ([dsp8] | [dsp16]) designates the additional bytes needed to complete the instruction. Your experiment proves this for bit,R0, and therefore "which isn't apparent from anywhere in the document" is not true. ;-) However, the manual could use some more words and some examples.


    The following is my interpretation, based on decades of development experience. You can verify it with experimental programs.

    This is the table given for bclr:g, cutting the leftmost column, as it does not add information. Instead, I have added a column on the right to show the additional byte(s):

    Addressing mode    dest    Additional byte(s)                    Note
    bit,R0             0000    [0000'bit]                            *1
    bit,R1             0001    [0000'bit]                            *1
    bit,R2             0010    [0000'bit]                            *1
    bit,R3             0011    [0000'bit]                            *1
    bit,A0             0100    [0000'bit]                            *1
    bit,A1             0101    [0000'bit]                            *1
    [A0]               0110    (none)                                *2
    [A1]               0111    (none)                                *2
    base:8[A0]         1000    [b7..b4'b3..b0]                       *3
    base:8[A1]         1001    [b7..b4'b3..b0]                       *3
    bit,base:8[SB]     1010    [b4..b1'b0bit]                        *4
    bit,base:8[FB]     1011    [b4..b1'b0bit]                        *5
    base:16[A0]        1100    [b15..b12'b11..b8][b7..b4'b3..b0]     *3
    base:16[A1]        1101    [b15..b12'b11..b8][b7..b4'b3..b0]     *3
    bit,base:16[SB]    1110    [b12..b9'b8..b5][b4..b1'b0bit]        *4
    bit,base:16        1111    [b12..b9'b8..b5][b4..b1'b0bit]        *6

    Notes:

    *1 As the registers are 16 bits wide, only 4 bits in the additional byte are used to address the bit. This is indeed not clearly stated.

    *2 No additional value is necessary, the bit address*6 is completely in An. This is equivalent to base:X[An] with base = 0.

    *3 The given base address is a byte address, and the value in An is the (16 bit) bit offset, giving the bit address*6.

    *4 The given bit number is a bit offset and the given base is an unsigned byte offset to the byte address in SB, giving the bit address*6.

    *5 The given bit number is a bit offset and the given base is a signed byte offset to the byte address in FB, giving the bit address*6.

    *6 Only bytes at addresses 00000₁₆ to 01FFF₁₆ can be accessed, as the manual explains in multiple places. Most probably this is a limitation of the MCU's hardware. For example, the internal adder used for address calculations might be only 16 bits wide, or the address logic for bit access might have only a 16-bit input path. In any case, this is not relevant to your issue.
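    The table can be turned into a small length decoder for a disassembler. This is only a sketch of my interpretation: the helper names are hypothetical, and it deliberately leaves the 16-bit forms undecoded beyond their length, because the byte order of the two additional bytes should be checked against the manual or a sample binary.

```python
# Register names selected by dest values 0000 to 0101 (table above).
REGISTERS = ["R0", "R1", "R2", "R3", "A0", "A1"]


def extra_bytes(dest: int) -> int:
    """Number of bytes following the two opcode bytes for a dest nibble."""
    if dest in (0b0110, 0b0111):   # [A0], [A1]: bit address is entirely in An
        return 0
    if dest >= 0b1100:             # all base:16 forms
        return 2
    return 1                       # bit,Rn / bit,An and all base:8 forms


def bclr_g_length(code: bytes) -> int:
    """Total length of a bclr:g instruction: [0111'1110][1000'dest] plus extras."""
    if code[0] != 0x7E or (code[1] & 0xF0) != 0x80:
        raise ValueError("not a bclr:g encoding")
    return 2 + extra_bytes(code[1] & 0x0F)


def decode_register_form(code: bytes):
    """Decode bit,Rn / bit,An: return (bit, register name)."""
    dest = code[1] & 0x0F
    if dest > 0b0101:
        raise ValueError("dest does not select a register")
    # Only the low 4 bits of the additional byte carry the bit number (note *1).
    return code[2] & 0x0F, REGISTERS[dest]
```

    With the bytes from the question, 0x7E 0x80 0x03 would be three bytes long and decode to bit 3 of R0.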


    To answer your specific questions:

    "take the value of a1, keep the low 3 bits, call that the bit. Take the remaining bits and shift them right 3 bits, that is the byte address, from which the bit should be extracted." Am I correct here?

    Yes, ignoring the mix-up of A0 and A1.

    [...] when looking at the operand codes 1000 and 1010 respectively. The manual states base:8[A0] and bit,base:8[A0] respectively, [...]

    Here you have again a typo or reading error: The operand code 1010 means bit,base:8[SB], using the static base register.

    [for operand code 1000 meaning base:8[A0]] Is A0 treated as a bit address in these cases as well? Is the 8-bit displacement a bit offset, a byte offset off A0, or both?

    The immediate base is a byte address. The value in A0 is a (16 bit) bit offset.

    [for operand code 1010 meaning bit,base:8[SB], replacing A0 by SB] Is SB treated as a bit address in these cases as well? Is the 8-bit displacement a bit offset, a byte offset off SB, or both?

    The immediate bit is a bit offset and the immediate base is a byte offset. The value in SB is a byte address.
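
    To make the three cases concrete, here is a minimal sketch of the address arithmetic as I read it. The function names are hypothetical, the 00000₁₆ to 01FFF₁₆ limit from note *6 is not modeled, and you should verify the results against real hardware or a simulator.

```python
def split_bit_address(bit_address: int):
    """Split a linear bit address into (byte_address, bit_number)."""
    return bit_address >> 3, bit_address & 0x07


def ea_indirect(an: int):
    """[An]: the register holds the complete bit address."""
    return split_bit_address(an)


def ea_base_indexed(base: int, an: int):
    """base:8[An]: base is a byte address, An is a bit offset from it."""
    return split_bit_address(base * 8 + an)


def ea_sb_relative(sb: int, base: int, bit: int):
    """bit,base:8[SB]: SB is a byte address, base a byte offset, bit a bit offset."""
    return sb + base, bit
```

    For instance, with A0 = 0x1234 the operand [A0] would refer to bit 4 of the byte at 0x0246, and calling ea_base_indexed with base = 0 gives the same result as ea_indirect, matching note *2.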