
How can I implement if(x >= '0' && x <= '9') range checks like isdigit in MIPS?


I have written the following function to check whether a character is a digit or not:

# IsDigit - tests if a character is a digit or not
# arguments:
#   $a0 = character byte
# return value:
#   $v0 =   1 - digit
#           0 - not a digit
IsDigit:
    lb $t0, ($a0) # obtain the character
    li $t1, 48 # '0' - character
    li $t2, 57 # '9' - character
    bge $t0, $t1, condition1
condition1:
    ble $t0, $t2, condition2

    li $v0, 0
    j return

condition2:
    li $v0, 1

return:                                                                                 
    # return 
    jr $ra 

Is there any better way to do or write this?

Edit: The following is the version-2

IsDigit:
    lb $t0, ($a0) # obtain the character
    li $t1, 48 # '0' - character
    li $t2, 57 # '9' - character
    bge $t0, $t1, condition1

    j zero

condition1: 

    ble $t0, $t2, condition2

zero:
    li $v0, 0
    j return

condition2:
    li $v0, 1
    j return

return:                                                                                 
    # return 
    jr $ra 

Edit-2: the following is version-3

IsDigit:
    lb $t0, ($a0) # obtain the character
    li $t1, 48 # '0' - character
    li $t2, 57 # '9' - character


    bge $t0, $t1, con1_fulfilled # bigger than or equal to '0'
    j con1_not_fulfilled

con1_fulfilled:
    ble $t0, $t2, con2_fullfilled #less than or equal to 9
    j con2_not_fulfilled

con2_fullfilled:
    li $v0, 1
    j return

con1_not_fulfilled:
con2_not_fulfilled:
    li $v0, 0

return:                                                                     
    # return 
    jr $ra 

Solution

  • In the general case, you use 2 branches that jump past the if() body. If either one is taken, the if body doesn't run. In assembly, you usually want to branch on the negation of the C condition, because you're jumping past the if body so it doesn't run. Your later versions do it backwards, so they also need unconditional j instructions, which makes the code more complicated than necessary.

    The opposite of <= (le) is > (gt). For C written to use inclusive ranges (le and ge), asm using the same numerical values should branch on the opposite conditions, which are exclusive (they exclude the equal case). Or you can adjust your constants and use bge $t0, '9'+1 or similar, which can be useful when a constant is right at the edge of what fits into a 16-bit immediate.

    # this does assemble with MARS or clang, handling pseudo-instructions
    # and I think it's correct.
    IsDigit:
        lb  $t0, ($a0)    # obtain the character
    
        blt   $t0, '0', too_low    # if(   $t0 >= '0'  
        bgt   $t0, '9', too_high   #    && $t0 <= '9')
          # fall through into the if body
        li    $v0, 1
        jr    $ra                    # return 1
    
    too_low:
    too_high:                    # } else {
        li    $v0, 0
    
    #end_of_else:
        jr    $ra                # return 0
    

    If this wasn't at the end of a function, you could j end_of_else from the end of the if body to jump over the else block. Or in this case, we could have put the li $v0, 0 ahead of the first blt, to fill the load delay slot instead of stalling the pipeline. (Of course a real MIPS also has branch-delay slots, and you can't have back-to-back branches. But bgt is a pseudo-instruction anyway, so there wouldn't really be back-to-back branches.)

    Also, instead of jumping to a common jr $ra, I simply duplicated the jr $ra into the other return path. If you had more cleanup to do, you might jump to one common return path. Otherwise tail duplication is a good thing to simplify the branching.
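
    As a rough illustration of that branch structure in C (a sketch only; the function name is_digit_branchy is hypothetical and not from the question), each goto stands in for a blt/bgt on the negated condition, and both early exits share the "else" path:

```c
/* Sketch: mirrors the asm control flow with gotos.
 * is_digit_branchy is a hypothetical name used only for illustration. */
static int is_digit_branchy(int c)
{
    if (c < '0') goto not_digit;   /* negation of (c >= '0'), like blt */
    if (c > '9') goto not_digit;   /* negation of (c <= '9'), like bgt */
    return 1;                      /* fall through into the if body */

not_digit:                         /* shared target, like too_low / too_high */
    return 0;
}
```

    Calling is_digit_branchy('5') returns 1, while '/' (one below '0') and ':' (one above '9') both return 0.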


    In this specific case, your conditions are related: you're doing a range-check, so you only need 1 sub and then 1 unsigned-compare against the length of the range. See "What is the idea behind ^= 32, that converts lowercase letters to upper and vice versa?" for more about range-checks on ASCII characters.
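
    The same idea can be sketched in C, assuming 32-bit unsigned arithmetic to match a MIPS register (all function names here are hypothetical, chosen for illustration). Subtracting '0' makes any value below '0' wrap around to a very large unsigned number, so a single unsigned compare against 10 rejects both ends of the range:

```c
#include <stdint.h>

/* Straightforward version with two compares, for reference. */
static int is_digit_two_compares(uint8_t c)
{
    return c >= '0' && c <= '9';
}

/* Range check: one subtract, one unsigned compare.
 * If c < '0', the subtraction wraps to a huge unsigned value,
 * which is certainly not below 10. */
static int is_digit_range_check(uint8_t c)
{
    return (uint32_t)c - (uint32_t)'0' < 10u;
}

/* Counts the byte values (0..255) on which the two versions disagree. */
static int count_mismatches(void)
{
    int bad = 0;
    for (unsigned c = 0; c < 256; c++) {
        if (is_digit_two_compares((uint8_t)c) != is_digit_range_check((uint8_t)c))
            bad++;
    }
    return bad;
}
```

    count_mismatches() is there only as a sanity check that the two forms agree on every possible byte; the range-check form is the one that maps onto a subtract followed by an unsigned set-less-than.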

    And since you're returning a boolean 0/1, you don't want to branch at all, but rather use sltu to turn a condition into a 0 or 1 in a register. (This is what MIPS uses instead of a FLAGS register like x86 or ARM.) Instructions like ble between two registers are pseudo-instructions for slt + beq/bne anyway; MIPS does have blez and bltz in hardware, as well as bne and beq between two registers.


    And BTW, the comments on your IsDigit don't match the code: they say that $a0 is a character, but actually you're using $a0 as a pointer to load a character. So you're passing a char by reference for no apparent reason, or passing a string and taking the first character.

    # IsDigit - tests if a character is a digit or not
    # arguments:
    #   $a0 = character byte (must be zero-extended, or sign-extended which is the same thing for low ASCII bytes like '0'..'9')
    # return value:
    #   $v0 = boolean: 1 -> it is an ASCII decimal digit in [0-9]
    
    IsDigit:
        addiu   $v0, $a0, -'0'            # wraps to a large unsigned value if below '0'
        sltiu   $v0, $v0, 10              # $v0 = bool($v0 < 10U)  (unsigned compare)
        jr      $ra
    

    MARS's assembler refuses to assemble -'0' as an immediate; you have to write it as -48 or -0x30. clang's assembler has no problem with addiu $v0, $a0, -'0'.

    If you write subiu $v0, $a0, '0', MARS constructs '0' using a braindead lui+ori, because its expansion of extended pseudo-instructions (which most assemblers don't support) is very simplistic. (MIPS doesn't have a subi instruction, only addi/addiu, both of which take sign-extended immediates.)
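
    To see why adding a negative immediate works as a subtract, here is a small C model of the arithmetic (a sketch under assumptions: addiu_model and is_digit_via_addiu are hypothetical names, and the model ignores everything about the real instruction except sign-extending the 16-bit immediate and wrapping at 32 bits). The immediate -48 is encoded as 0xFFD0, which sign-extends to 0xFFFFFFD0:

```c
#include <stdint.h>

/* Hypothetical model of addiu: sign-extend the 16-bit immediate,
 * then add with 32-bit wraparound. */
static uint32_t addiu_model(uint32_t rs, uint16_t imm16)
{
    /* Manual sign extension, avoiding implementation-defined casts. */
    int32_t imm = (imm16 & 0x8000u) ? (int32_t)imm16 - 0x10000 : (int32_t)imm16;
    return rs + (uint32_t)imm;
}

/* Hypothetical helper: the two-instruction IsDigit expressed with the model.
 * 0xFFD0 is the 16-bit two's complement encoding of -48, i.e. -'0'. */
static int is_digit_via_addiu(uint32_t c)
{
    return addiu_model(c, 0xFFD0u) < 10u;   /* unsigned compare, like sltiu */
}
```

    With this model, addiu_model('5', 0xFFD0) gives 5, and addiu_model('/', 0xFFD0) wraps to 0xFFFFFFFF, which fails the unsigned compare against 10.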