Understanding RISC-V objdump output

I am examining the objdump of a C file that I have compiled using the following commands:

riscv64-unknown-elf-gcc -O0 -o maxmul.o maxmul.c
riscv64-unknown-elf-objdump -d maxmul.o > maxmul.dump

Strangely (or not), the instruction addresses appear to be aligned not on 32-bit word boundaries but on 16-bit boundaries.

Can anyone please explain why?

Thanks.

objdump excerpt:

00000000000101da <main>:
   101da:   7155                    addi    sp,sp,-208
   101dc:   e586                    sd  ra,200(sp)
   101de:   e1a2                    sd  s0,192(sp)
   101e0:   0980                    addi    s0,sp,208
   ...

C-code:

int main()
{

  int first[3][3], second[3][3], multiply[3][3];
  int golden[3][3];
  int sum = 0;

  first[0][0] = 1;  first[0][1] = 2;  first[0][2] = 3;
  first[1][0] = 4;  first[1][1] = 5;  first[1][2] = 6;
  first[2][0] = 7;  first[2][1] = 8;  first[2][2] = 9;

  second[0][0] = 9;  second[0][1] = 8;  second[0][2] = -7;
  second[1][0] = -6; second[1][1] = 5;  second[1][2] = 4;
  second[2][0] = 3;  second[2][1] = 2;  second[2][2] = -1;

  golden[0][0] = 6;  golden[0][1] = 24;  golden[0][2] = -2;
  golden[1][0] = 24; golden[1][1] = 69;  golden[1][2] = -14;
  golden[2][0] = 42; golden[2][1] = 114;  golden[2][2] = -26;

  int i, ii, iii;
  for (i = 0; i < 3; i++) {
      for (ii = 0; ii < 3; ii++) {
          for (iii = 0; iii < 3; iii++) {
              //printf("first[%d][%d] * second[%d][%d] \n",  i, iii, iii, ii);
              //printf("%d * %d (%d,%d)\n", first[i][ii], second[ii][i], i, ii);
              sum +=  first[i][iii] * second[iii][ii];
          }
          //printf("sum = %d\n", sum);
          multiply[i][ii] = sum;
          sum = 0;
      }
  }

 int c, d;
 int err = 0;
 for ( c = 0; c < 3; c++) {
      for ( d = 0; d < 3; d++) {
    //printf("%d\t", multiply[c][d]);
          if (multiply[c][d] != golden[c][d]) {
              fail(golden[c][d], multiply[c][d]);
              err++;
          }
      }

      //printf("\n");
    }
    if (err == 0) {
          pass();
      }
   return 0;
}

Solution

When compiling (or assembling) for RV64GC or RV32GC (or any other target that enables the "C" standard extension for compressed instructions), the compiler (or assembler) automatically replaces some instructions with their compressed equivalents.

Non-compressed instructions are encoded in 32 bits, while compressed instructions are encoded in 16 bits.
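The two sizes can be told apart from the encoding itself: in the standard RISC-V encoding, a 32-bit instruction has its two lowest bits set to 11, while a 16-bit instruction has any other value there. A minimal sketch of that check follows; the helper name rv_insn_length is made up for illustration and is not part of any toolchain, and encodings longer than 32 bits are deliberately not handled.

```c
#include <stdint.h>

/* Hypothetical helper: size in bytes of a RISC-V instruction, given its
 * lowest 16 bits. Only the 16- and 32-bit encodings are recognized;
 * longer encodings (48 bits and up) are reported as 0. */
static unsigned rv_insn_length(uint16_t low_half)
{
    if ((low_half & 0x3) != 0x3)
        return 2;   /* low bits 00, 01 or 10: compressed instruction */
    if ((low_half & 0x1c) != 0x1c)
        return 4;   /* low bits 11 and bits [4:2] not 111: 32-bit */
    return 0;       /* longer encoding, not handled in this sketch */
}
```

Applied to the question's excerpt, rv_insn_length(0x7155) and rv_insn_length(0xe586) both give 2, which is why main's first instructions sit only two bytes apart.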

Emitting a compressed instruction shifts the alignment of the next instruction: from a 32-bit boundary to a 16-bit boundary, or back again. As a result, not only 16-bit instructions but also 32-bit ones may sit at an address that is only 16-bit aligned. In other words, both kinds of instructions (compressed and normal) are tightly packed side by side.

By default, objdump -d doesn't explicitly indicate that an instruction is compressed, because it uses the same mnemonic as for the uncompressed variant, although the number of bytes in the displayed raw instruction gives it away (4 vs. 2 bytes).

However, you can tell objdump to use separate mnemonics for compressed instructions so that they are more easily recognizable (they then start with c.), e.g.:

$ riscv64-unknown-elf-objdump -d -M no-aliases rotate

   [..]
   101e4:       00d66533                or      a0,a2,a3
   101e8:       8082                    c.jr    ra

00000000000101ea <rotr>:
   101ea:       00b55633                srl     a2,a0,a1
   [..]

Note that with the switch -M no-aliases, pseudo-instructions are no longer displayed; the corresponding real instruction(s) are shown instead.
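To illustrate the tight packing, here is a rough sketch that walks the raw bytes of the three instructions from the rotate excerpt above and records where each one starts. The names rv_insn_offsets and rv_sample are hypothetical, the byte array is just the little-endian form of the hex words shown in the dump, and the code assumes that only 16- and 32-bit encodings occur.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: record the byte offset of every instruction in a
 * little-endian RISC-V code buffer. Assumes only 16- and 32-bit encodings.
 * Returns the number of offsets written to `offsets`. */
static size_t rv_insn_offsets(const uint8_t *code, size_t size,
                              size_t *offsets, size_t max_offsets)
{
    size_t pos = 0, count = 0;

    while (pos + 2 <= size && count < max_offsets) {
        /* Low two bits of the first byte: 11 means 32-bit, else 16-bit. */
        size_t len = ((code[pos] & 0x3) == 0x3) ? 4 : 2;
        if (pos + len > size)
            break;              /* truncated instruction at the end */
        offsets[count++] = pos;
        pos += len;
    }
    return count;
}

/* Raw bytes of the instructions at 0x101e4, 0x101e8 and 0x101ea. */
static const uint8_t rv_sample[] = {
    0x33, 0x65, 0xd6, 0x00,     /* 00d66533  or   a0,a2,a3 */
    0x82, 0x80,                 /* 8082      c.jr ra       */
    0x33, 0x56, 0xb5, 0x00,     /* 00b55633  srl  a2,a0,a1 */
};
```

Calling rv_insn_offsets(rv_sample, sizeof rv_sample, offsets, 8) on an array of eight size_t yields the offsets 0, 4 and 6. Added to the base address 0x101e4, they give 0x101e4, 0x101e8 and 0x101ea, the same addresses objdump prints. The last one is a 4-byte srl starting on an address that is 2-byte aligned but not 4-byte aligned.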