
Is it possible to change in gdb whether immediates in disassembly are displayed in hex vs. decimal?


When I use the "disas" command in gdb to disassemble RISC-V code, it shows me this:

li  a0,64

I would instead like it to show a hexadecimal immediate, rather than decimal. E.g.

li  a0,0x40

Is it possible to do this in GDB for RISC-V?

(I'm guessing that gdb forces decimal, so it can show signed negative values in decimal rather than hex (which might confuse some people), but I'm asking just in case it's possible to configure this.)


Solution

  • No.

    The RISC-V disassembler that GDB uses does not have an option to format immediates as hex.

    However, it might be possible to do what you want using the current development (unreleased) version of GDB.

    Since GDB 13 there is a Python API that wraps the disassembler. The version that exists in GDB 13 is pretty basic, and doesn't help you as much as you might like. However, this API was extended in the development branch, so it should be possible to write a Python script that reformats the immediates how you'd like.

    Here's an initial script. Save it into a file named riscv-dis.py:

    Remember: this will only work with GDB 14, or with the current development branch of GDB.

    import gdb.disassembler
    
    
    class riscv_dec_imm_disasm(gdb.disassembler.Disassembler):
        def __init__(self):
            super().__init__("riscv_dec_imm_disasm")
    
        def __call__(self, info):
            result = gdb.disassembler.builtin_disassemble(info)
            parts = []
            for p in result.parts:
                if (
                    isinstance(p, gdb.disassembler.DisassemblerTextPart)
                    and (
                        p.style == gdb.disassembler.STYLE_IMMEDIATE
                        or p.style == gdb.disassembler.STYLE_ADDRESS_OFFSET
                    )
                    and p.string[0:2] != "0x"
                ):
                    # An immediate part that does not have a '0x' prefix, so
                    # it should be decimal.  Let's reformat it as hex.
                    #
                    # TODO: This assumes that all immediates are 8 bits wide,
                    # which isn't going to be correct.  I guess you'll
                    # need to map from mnemonic to immediate size in order
                    # to do this correctly.
                    v = int(p.string)
                    parts.append(info.text_part(p.style, ("0x%x" % (v & 0xFF))))
                else:
                    # Don't change this part.
                    parts.append(p)
            return gdb.disassembler.DisassemblerResult(length=result.length,
                                                       parts=parts)
    
    
    gdb.disassembler.register_disassembler(riscv_dec_imm_disasm())
    

    Then we can use it in a GDB session like this:

    $ gdb -q /tmp/hello.rv32imc.x 
    Reading symbols from /tmp/hello.rv32imc.x...
    (gdb) disassemble main 
    Dump of assembler code for function main:
       0x000101aa <+0>:     add sp,sp,-16
       0x000101ac <+2>:     sw  ra,12(sp)
       0x000101ae <+4>:     sw  s0,8(sp)
       0x000101b0 <+6>:     add s0,sp,16
       0x000101b2 <+8>:     lui a5,0x18
       0x000101b4 <+10>:    add a0,a5,1756 # 0x186dc
       0x000101b8 <+14>:    jal 0x1040e <puts>
       0x000101ba <+16>:    li  a7,7
       0x000101bc <+18>:    li  a6,6
       0x000101be <+20>:    li  a5,5
       0x000101c0 <+22>:    li  a4,4
       0x000101c2 <+24>:    li  a3,3
       0x000101c4 <+26>:    li  a2,2
       0x000101c6 <+28>:    li  a1,1
       0x000101c8 <+30>:    li  a0,0
       0x000101ca <+32>:    jal 0x1014c <call_me>
       0x000101cc <+34>:    li  a5,0
       0x000101ce <+36>:    mv  a0,a5
       0x000101d0 <+38>:    lw  ra,12(sp)
       0x000101d2 <+40>:    lw  s0,8(sp)
       0x000101d4 <+42>:    add sp,sp,16
       0x000101d6 <+44>:    ret
    End of assembler dump.
    (gdb) source riscv-dis.py
    (gdb) disassemble main 
    Dump of assembler code for function main:
       0x000101aa <+0>:     add sp,sp,0xf0
       0x000101ac <+2>:     sw  ra,0xc(sp)
       0x000101ae <+4>:     sw  s0,0x8(sp)
       0x000101b0 <+6>:     add s0,sp,0x10
       0x000101b2 <+8>:     lui a5,0x18
       0x000101b4 <+10>:    add a0,a5,0xdc
       0x000101b8 <+14>:    jal 0x1040e <puts>
       0x000101ba <+16>:    li  a7,0x7
       0x000101bc <+18>:    li  a6,0x6
       0x000101be <+20>:    li  a5,0x5
       0x000101c0 <+22>:    li  a4,0x4
       0x000101c2 <+24>:    li  a3,0x3
       0x000101c4 <+26>:    li  a2,0x2
       0x000101c6 <+28>:    li  a1,0x1
       0x000101c8 <+30>:    li  a0,0x0
       0x000101ca <+32>:    jal 0x1014c <call_me>
       0x000101cc <+34>:    li  a5,0x0
       0x000101ce <+36>:    mv  a0,a5
       0x000101d0 <+38>:    lw  ra,0xc(sp)
       0x000101d2 <+40>:    lw  s0,0x8(sp)
       0x000101d4 <+42>:    add sp,sp,0x10
       0x000101d6 <+44>:    ret
    End of assembler dump.
    (gdb) 
    

    There's one big problem with this: in order to display negative immediates as 0xfff... I need to mask the immediate to its size, which means I need to know how wide the immediate is. The only way to know that would be to look up the immediate size based on (I guess) the instruction mnemonic. The script above just assumes all immediates are 8 bits wide, which is why -16 is shown as 0xf0 and 1756 is truncated to 0xdc in the second listing.
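
To make the masking step concrete, here is a minimal sketch that runs outside GDB. Everything in it is an assumption for illustration: the helper names (`imm_to_hex`, `imm_to_signed_hex`, `format_imm`) and the `IMM_BITS` table are hypothetical, and the table is deliberately incomplete. The 12-bit width matches the I-type and S-type immediates of the base RISC-V ISA; compressed instructions use narrower fields, and the mnemonics GDB prints (for example `add` or `li`) may not map cleanly onto a single width, so a real lookup would need more care.

```python
# Hypothetical mnemonic -> immediate width table (illustrative, incomplete).
# 12 bits matches I-type and S-type immediates in the base RISC-V ISA;
# compressed encodings are not covered here.
IMM_BITS = {"addi": 12, "lw": 12, "sw": 12}
DEFAULT_BITS = 12  # assumption: used when the mnemonic is not in the table


def imm_to_hex(text, bits):
    """Format a decimal immediate string as hex, masked to `bits` bits."""
    value = int(text)
    return "0x%x" % (value & ((1 << bits) - 1))


def imm_to_signed_hex(text):
    """Format a decimal immediate string as signed hex; no width needed."""
    value = int(text)
    return ("-0x%x" % -value) if value < 0 else ("0x%x" % value)


def format_imm(mnemonic, text):
    """Look up the (assumed) width for a mnemonic and mask accordingly."""
    return imm_to_hex(text, IMM_BITS.get(mnemonic, DEFAULT_BITS))


if __name__ == "__main__":
    print(imm_to_hex("-16", 8))      # what the 8-bit script effectively does
    print(imm_to_hex("-16", 12))     # same value masked to 12 bits
    print(imm_to_signed_hex("-16"))  # alternative that avoids the width problem
```

Inside the disassembler class, the string returned by one of these helpers could be passed to `info.text_part(p.style, ...)` in place of the hard-coded `v & 0xFF`. If the two's-complement form is not essential, the signed variant sidesteps the width lookup entirely, at the cost of printing `-0x10` rather than `0xff0`.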