We know that jal specifies a 21-bit offset. However, it does not encode all 21 bits, only 20 of them. The reason is that the least significant bit of a target address is always zero, because the smallest possible RISC-V instruction is 2 bytes, so this bit is not encoded in the instruction.
By encoding the offset this way, jal provides a jump range of ±1 MiB. If jal did encode the LSB, it would offer only a ±512 KiB jump range.
However, the jalr instruction, which specifies a 12-bit offset, does encode the LSB. This reduces the jump range to ±2 KiB (instead of ±4 KiB). I know that jalr uses the I-type format, the same as addi, and that the LSB of the immediate has to be encoded for this kind of instruction. However, I see no reason why the least significant bit has to be encoded for jalr.
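To make the difference concrete, here is a sketch in RISC-V assembly (func and the pointer in t0 are hypothetical):

```
# jal: 20 encoded bits, implicitly scaled by 2 -> +/-1 MiB reach from the pc
jal   ra, func          # func must lie within +/-1 MiB of this instruction

# jalr: I-type, 12-bit immediate used as-is (LSB encoded) -> +/-2 KiB around rs1
jalr  ra, -4(t0)        # target = (t0 - 4), with bit 0 of the result cleared
```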
JALR is used for two relatively distinct purposes: indirect branches (returns, calls through function pointers, jump tables) and, together with AUIPC, far pc-relative branches and calls.

For the former, indirect branches, the immediate value is always 0, which is to say that effectively no immediate is used at all!
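For example (a sketch; the function pointer in t0 and the table target in t1 are assumptions):

```
ret                     # pseudo-instruction for: jalr x0, 0(ra)  (function return)
jr    t1                # pseudo-instruction for: jalr x0, 0(t1)  (e.g., a jump table)
jalr  ra, 0(t0)         # indirect call through a function pointer held in t0
```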
For the latter, JALR is used in conjunction with AUIPC, which forms the upper 20 bits of a pc-relative address; JALR then supplies the lower 12 bits, for a total pc-relative offset of 32 bits.
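For instance, a far call expands to roughly the following sequence (far_func is a hypothetical symbol, t0 an arbitrary scratch-register choice; %pcrel_hi and %pcrel_lo are the GNU assembler's pc-relative relocation operators, with %pcrel_lo referring back to the label of the AUIPC):

```
1: auipc t0, %pcrel_hi(far_func)    # upper 20 bits of the pc-relative offset
   jalr  ra, %pcrel_lo(1b)(t0)      # JALR supplies the lower 12 bits: full 32-bit reach
```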
However, AUIPC is used both for far branches and for pc-relative data access. Thus, the two uses share the same 12-bit low part: loads and stores supply it through their 12-bit immediates, and JALR follows suit by using a 12-bit immediate field laid out just like theirs. The designers chose to share one AUIPC rather than have two different AUIPC instructions for these two uses (reference from code to code vs. reference from code to data).
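The data-access case uses the very same AUIPC form, only with a load or store supplying the low 12 bits (my_var is a hypothetical symbol):

```
1: auipc t0, %pcrel_hi(my_var)      # same AUIPC form as in the far call above
   lw    a0, %pcrel_lo(1b)(t0)      # the load's 12-bit immediate supplies the low part
```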
In summary, the range of JALR is mostly unimportant, as long as it can supply the remaining 12 bits to complement AUIPC's upper 20 bits. Sure, there are other approaches, but this one has the advantage of reuse: only a single AUIPC instruction is needed.