assembly riscv immediate-operand

Understand EQU and >> operators on them in RISC-V assembly, with LUI and ADDI


My professor posted this as one of the answers to a homework problem. Can anyone break it down for me? I don't understand what he is doing with CON1 to CON4, or what the >> and 0x0FFF mean.

CON1:   EQU 6000
CON2:   EQU 6245
CON3:   EQU 10000
CON4:   EQU 10245
A:  DM 4                         ; DM is Define Memory

    addi    t1,  x0, A           ; t1 = &A

    lui t0,  (CON1>>12) + ((CON1 & 0x0800)>>11)
    addi    t0,  t0, CON1&0xFFF
    sd  t0,  0(t1)            // Cut and paste from last question of Quiz1
                                      // Blank line between groups of statements
    lui t0,  (CON2>>12) + ((CON2 & 0x0800)>>11)
    addi    t0,  t0, CON2&0xFFF
    sd  t0,  8(t1)

    lui t0,  (CON3>>12) + ((CON3 & 0x0800)>>11)
    addi    t0,  t0, CON3&0xFFF
    sd  t0,  16(t1)

    lui t0,  (CON4>>12) + ((CON4 & 0x0800)>>11)
    addi    t0,  t0, CON4&0xFFF
    sd  t0,  24(t1)
                                      // We need this to avoid the NO INSTRUCTION error
    ebreak x0, x0, 0              ; Suspend program.

Any help would be appreciated, thank you. We are using RISC-V.


Solution

  • In the RISC-V base instruction set, each instruction is encoded in 32 bits. That leaves only limited space for immediate operands: 12 bits in addi and 20 bits in lui. Thus, to get a larger constant into a register (which is 32 bits wide with RV32G and 64 bits wide with RV64G) you need to split it and move the parts with multiple instructions, i.e. 2 with RV32G and up to 8 with RV64G.

    With 32-bit RISC-V (RV32G), larger constants can be loaded with the load upper immediate (lui) and add immediate (addi) instructions. The immediate operand of lui is 20 bits wide, while addi allows for an immediate operand of 12 bits. Together, they are sufficient to load constants that use up to 32 bits.

    lui left-shifts its 20-bit immediate operand by 12 bits, fills the lower 12 bits with zeros, and loads the result into the destination register (on RV64 the 32-bit result is additionally sign-extended to 64 bits). Hence its name. addi sign-extends its 12-bit immediate operand before adding it.

    Thus, with RV32G, to load a larger constant with lui followed by addi, one takes the upper 20 bits of the constant by logically right-shifting it by 12 bits, so that the 12-bit left-shift performed by lui puts them back in place. The lower 12 bits are then masked out to get the operand for addi.

    This is sufficient as long as bit 11 of the constant, which becomes the sign bit of the addi immediate, is 0. addi always sign-extends its immediate, so when that bit is 1 the immediate is treated as a negative number, and we have to increase the lui operand so that the superfluous sign bits are zeroed out again in the addition.
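
    To see why the plain split is not enough, here is a minimal Python sketch that models lui and addi on a 32-bit register. The helper names (sign_extend_12, lui, addi, naive_load) are made up for this illustration and are not part of any toolchain:

```python
MASK32 = 0xFFFFFFFF  # RV32 registers hold 32 bits


def sign_extend_12(imm):
    """Interpret the low 12 bits of imm as a signed two's-complement value."""
    imm &= 0xFFF
    return imm - 0x1000 if imm & 0x800 else imm


def lui(imm20):
    """Model of lui: place the 20-bit immediate in bits 31..12, low bits zero."""
    return (imm20 << 12) & MASK32


def addi(reg, imm12):
    """Model of addi: add the sign-extended 12-bit immediate, wrapping at 32 bits."""
    return (reg + sign_extend_12(imm12)) & MASK32


def naive_load(x):
    """Split x into high and low parts WITHOUT compensating for sign extension."""
    return addi(lui(x >> 12), x & 0xFFF)


print(naive_load(6000))  # bit 11 of 6000 is clear
print(naive_load(6245))  # bit 11 of 6245 is set
```

    For 6000 the naive split reproduces the constant, while for 6245 the result is 2149, which is exactly 4096 too small.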

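    Say we denote the high part of our constant x by h (the upper 20 bits, already shifted into place) and the low part by l (the lower 12 bits). Let e be the additional sign bits that the sign extension in addi places above l, and c the correction that has to cancel them. Since RISC-V implements two's complement and arithmetic wraps on register overflow, we can use modular arithmetic to see that: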

         h + l = x                                 # taking register widths into account:
     => (h + l) % 2**32 = x % 2**32                # in case l is sign-extended, e is added and c must cancel it:
     => (h + l + e + c) % 2**32 = x % 2**32        # replace e with the additional sign bits (0xFFFFF000):
    <=> (h + l + 4294963200 + c) % 2**32 = x % 2**32       # choose c = 4096 to eliminate e:
    <=> (h + l + 4294963200 + 4096) % 2**32 = x % 2**32
    <=> ((h + l) % 2**32 + (4294963200 + 4096) % 2**32) % 2**32 = x % 2**32
    <=> ((h + l) % 2**32 + 0) % 2**32 = x % 2**32
    

    Thus, we have to add 1 to the lui immediate operand (which equals 4096 after being left-shifted by 12 bits) if and only if bit 11 of the constant is set, i.e. the immediate operand of addi becomes negative when it is sign-extended.
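
    This rule can be checked numerically. The following is only a Python model of the arithmetic, not assembler output, and split and load are hypothetical helper names chosen for the sketch:

```python
MASK32 = 0xFFFFFFFF  # 32-bit register width


def split(x):
    """Return (lui_imm, addi_imm) for a 32-bit constant x."""
    lo = x & 0xFFF                      # 12-bit operand for addi
    carry = (x >> 11) & 1               # 1 iff addi will see a negative immediate
    hi = ((x >> 12) + carry) & 0xFFFFF  # 20-bit operand for lui, wraps around
    return hi, lo


def load(x):
    """Model lui followed by addi on a 32-bit register."""
    hi, lo = split(x)
    lo_signed = lo - 0x1000 if lo & 0x800 else lo  # sign extension done by addi
    return ((hi << 12) + lo_signed) & MASK32


# The sign bits added by addi (e) and the +1 in lui (c) cancel modulo 2**32.
e = 0xFFFFF000  # 4294963200
c = 1 << 12     # 4096
assert (e + c) % 2**32 == 0

# Spot-check a few constants, including edge cases near the wrap-around.
for x in (0, 6000, 6245, 10000, 10245, 0x7FFFF800, 0xFFFFFFFF):
    assert load(x) == x

print(split(6245))
```

    For 6245 the split yields 2 for lui and 2149 (0x865) for addi, which addi sign-extends to -1947.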

    In your assembly example, >> denotes a right-shift, << a left-shift and & a bitwise AND. They are used to implement the described splitting and arithmetic, e.g. in

     lui t0,  (CON1>>12) + ((CON1 & 0x0800)>>11)
     addi    t0,  t0, CON1&0xFFF
    

    where CON1 & 0x0800 isolates the 12th bit (bit 11), i.e. the sign bit of the addi immediate operand. If it is set, then ((CON1 & 0x0800)>>11) evaluates to 1 and thus cancels out the superfluous sign bits added by the following addi instruction. CON1&0xFFF masks the lowest 12 bits.
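
    As a worked example, the same expressions can be evaluated for all four constants from the question. This is a small Python sketch; the operands helper and the table dictionary are names invented for the illustration:

```python
CONSTANTS = {"CON1": 6000, "CON2": 6245, "CON3": 10000, "CON4": 10245}


def operands(con):
    """Evaluate the two operand expressions exactly as written in the snippet."""
    lui_imm = (con >> 12) + ((con & 0x0800) >> 11)
    addi_imm = con & 0xFFF
    return lui_imm, addi_imm


table = {name: operands(value) for name, value in CONSTANTS.items()}

for name, (lui_imm, addi_imm) in table.items():
    print(f"{name}: lui operand {lui_imm}, addi operand {addi_imm:#x}")
```

    CON1 and CON3 have bit 11 clear, so their lui operands are simply the upper bits (1 and 2). CON2 and CON4 have bit 11 set, so their lui operands are bumped by one (to 2 and 3), with addi operands 0x865 and 0x805.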

    In standard RISC-V assembly all this tedious bit managing can be avoided by just using the load immediate (li) pseudo-instruction, e.g.:

    li     t1, 6245
    

    Which the assembler automatically translates into the optimal instruction sequence (check e.g. with objdump):

    lui    t1, 0x2
    addi   t1, t1,-1947
    

    Alternatively, with the GNU as assembler there are also the %hi and %lo relocation operators for splitting the operand into the upper and lower part:

    lui    a1, %hi(6245)
    addi   a1, a1, %lo(6245)
    

    Which arguably is also more readable than the clutter in your snippet.
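
    For absolute constants, %hi and %lo are commonly described as computing (x + 0x800) >> 12 and the sign-extended low 12 bits. Taking that description as an assumption, the Python sketch below mirrors it and compares it with the hand-written expression from the question. The function names hi and lo are only stand-ins for the assembler operators:

```python
def hi(x):
    """Assumed model of %hi(x): upper 20 bits, rounded to absorb the sign of %lo."""
    return ((x + 0x800) >> 12) & 0xFFFFF


def lo(x):
    """Assumed model of %lo(x): low 12 bits, read as a signed value."""
    low = x & 0xFFF
    return low - 0x1000 if low & 0x800 else low


def hand_written_hi(x):
    """The lui operand expression used in the question's snippet."""
    return ((x >> 12) + ((x & 0x0800) >> 11)) & 0xFFFFF


x = 6245
print(hex(hi(x)), lo(x))
print((hi(x) << 12) + lo(x) == x)
print(hi(x) == hand_written_hi(x))
```

    Under this model, hi(6245) is 2 and lo(6245) is -1947, which matches the lui/addi pair the assembler produced for li above.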

    This also works with symbols in GNU as, e.g.:

    .set CON2, 6245
    
    li    a1, 6245
    
    lui   a2, %hi(CON2)
    addi  a2, a2, %lo(CON2)
    
    li    a3, CON2
    
    # => a1 == a2 == a3