Assembly MIPS: Nested loops


It got a bit tricky when I tried to write some code that would print five lines of four asterisks each.

****
****
****
****

So I thought a nested loop could save the day. Boy, was I wrong.

So I made an inner loop for the asterisks and an outer loop for the newlines, as below:

.text
.globl main
main:
add $t0, $zero, $zero   #i counter for the inner loop
add $t2, $zero, $zero   #j counter for the outer loop

outerloop:

    innerloop:

        slti    $t1, $t0, 4     #while (i<4)
        beq     $t1, $zero, innerexit

        li      $v0, 11         #printf("*");
        la      $a0, '*'
        syscall

        addiu   $t0, $t0, 1     #i++

    j innerloop
    innerexit:

slti    $t3, $t2, 5     #while (j<5)
beq     $t3, $zero, outerexit

li      $v0, 11         #printf("\n");
la      $a0, '\n'
syscall

addiu   $t2, $t2, 1     #j++

j outerloop
outerexit:

li  $v0, 10
syscall

But the output gives me just one line:

****

What's the matter with the outer loop?


Solution

  • The simplest way would be to make the print-string system call N times in a single, non-nested loop. (Arguably, making one long string containing all the lines would be even "simpler", but it's less maintainable and bad for the size of your program with large N.)

    Note the use of down-counters, counting down towards zero so we can bne against $zero. This is idiomatic for asm, and so is putting the conditional branch at the bottom of the loop, especially for any loop whose trip count is guaranteed to be at least 1. (When that's not the case, you'd normally use a branch outside the loop to skip it if needed.)

    ## Tested, works in MARS 4.5
    .data
    line: .asciiz "****\n"
    
    .text
    .globl main
    main:
       li  $t0, 4     # row counter
       li  $v0, 4     # print string call number
       la  $a0, line
    
     .printloop:
       syscall           # print_string has no return value, doesn't modify v0
       addiu  $t0, $t0, -1
       bnez   $t0,  .printloop           # shorthand for BNE $t0, $zero, .printloop
    
       li  $v0, 10       # exit
       syscall
    

    You could generate the string in a temporary buffer, with a count from a register in a separate loop before the print loop. So you can still support runtime-variable row and column counts, with two sequential loops instead of nested.

    With a buffer aligned by 4, we can store a whole word of 4 characters at once so that loop doesn't have to run as many iterations. (li reg, 0x2a2a2a2a takes 2 instructions but li reg, 0x2a2a only takes one, so going by 2 with sh would make the code smaller).

    .text
    .globl main
    main:
    .eqv    WIDTH, 5
    .eqv    ROWS, 4
    
       addiu  $sp, $sp, -32         # reserve some stack space.    (WIDTH&-4) + 8   would be plenty, but MARS doesn't do constant expressions.
       move   $t0, $sp
       
       addiu  $t1, $sp, WIDTH       # pointer to end of buf = buf + line length; the length could be in a register
       li     $t2, 0x2a2a2a2a           # MARS doesn't allow '****' or even '*' << 8 | '*'
     .makerow:                  # do{
       sw     $t2, ($t0)          # store 4 characters
       addiu  $t0, $t0, 4         # p+=4
       sltu   $t7, $t0, $t1
       bnez   $t7, .makerow     # }while(p < endp);
    # overshoot is fine; we reserved enough space to do whole word stores
       li     $t2, '\n'
       sb     $t2, ($t1)
       sb     $zero, 1($t1)     # terminating 0 after newline.  Unfortunately an sh halfword store to do both at once might be unaligned
    
       move   $a0, $sp
       li     $t0, ROWS
       li     $v0, 4             # print string call number
    
     .printloop:
       syscall                   # print_string has no return value, doesn't modify v0
       addiu  $t0, $t0, -1
       bnez   $t0,  .printloop           # shorthand for BNE $t0, $zero, .printloop.      # }while(--t != 0)
    
    
    ## If you were going to return instead of exit, you'd restore SP:
    #  addiu $sp, $sp, 32
    
       li  $v0, 10       # exit
       syscall
    

    As expected, this prints 5 asterisks on every row.

    *****
    *****
    *****
    *****
    

    Generally (in real systems) a system call is much more expensive than normal instructions, so preparing a single large buffer with multiple newlines would actually make sense. (The overhead of a system call dwarfs the difference between writing 1 vs. 5 or even 20 bytes, so even though calling print_string instead of print_char is kind of hiding work inside the system call, it's justified.)

    In that case you probably would want nested loops, but with sb / addiu $reg, $reg, 1 pointer-increment instead of syscall. Only make one system call at the very end.
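
    A minimal sketch of that nested-loop version, assuming the MARS/SPIM syscall numbers used above (4 = print string, 10 = exit). The 64-byte `buf`, the label names and the register choices are illustrative, not from the code above; 4 rows of 4 asterisks plus a newline each, plus the terminating 0, need only 21 bytes. Both loops use down-counters with the branch at the bottom, which is fine here because both trip counts are at least 1.

    ```mips
    .data
    buf:    .space 64               # hypothetical buffer; 21 bytes are actually used

    .text
    .globl main
    main:
        la      $t0, buf            # p = buf
        li      $t4, '*'
        li      $t5, '\n'
        li      $t2, 4              # row down-counter
    rowloop:                        # do {
        li      $t1, 4              #   column down-counter, reset for every row
    colloop:                        #   do {
        sb      $t4, ($t0)          #     *p = '*'
        addiu   $t0, $t0, 1         #     p++
        addiu   $t1, $t1, -1
        bnez    $t1, colloop        #   } while (--cols != 0);
        sb      $t5, ($t0)          #   *p = '\n'
        addiu   $t0, $t0, 1         #   p++
        addiu   $t2, $t2, -1
        bnez    $t2, rowloop        # } while (--rows != 0);
        sb      $zero, ($t0)        # terminating 0 for print_string

        li      $v0, 4              # print string: the only output system call
        la      $a0, buf
        syscall

        li      $v0, 10             # exit
        syscall
    ```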

    Or use a loop to store all the * characters 4 at a time (for ROWS * COLS / 4 rounded up iterations), then another loop that inserts the '\n' newlines where they belong. This lets you get all the data into memory with fewer instructions than doing everything in order 1 byte at a time. (For very large row*col counts, you would probably limit your buffer size to 4 or 8 KiB or something, so your data is still in cache when the kernel's system call handler reads it to copy it to wherever it needs to be.)
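
    The two-pass idea could look roughly like this. It's a sketch under assumptions: 4 rows of 4 asterisks, so each row occupies 5 bytes including its newline and the text is 20 bytes (5 whole words); the word-aligned 64-byte `buf` and the label names are hypothetical. The first pass fills the whole text area with asterisks, the second overwrites the last byte of each 5-byte row with a newline.

    ```mips
    .data
    .align 2
    buf:    .space 64               # hypothetical word-aligned buffer

    .text
    .globl main
    main:
        la      $t0, buf            # p = buf
        addiu   $t1, $t0, 20        # endp = buf + rows * (cols + 1)
        li      $t2, 0x2a2a2a2a     # four '*' characters
    fill:                           # do {
        sw      $t2, ($t0)          #   store 4 asterisks at once
        addiu   $t0, $t0, 4         #   p += 4
        sltu    $t7, $t0, $t1
        bnez    $t7, fill           # } while (p < endp);

        la      $t0, buf            # p = start of first row
        li      $t3, '\n'
    newlines:                       # do {
        sb      $t3, 4($t0)         #   last byte of this row becomes '\n'
        addiu   $t0, $t0, 5         #   p += row length
        sltu    $t7, $t0, $t1
        bnez    $t7, newlines       # } while (p < endp);
        sb      $zero, ($t1)        # terminating 0 right after the last newline

        li      $v0, 4              # print string, once
        la      $a0, buf
        syscall

        li      $v0, 10             # exit
        syscall
    ```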


    BTW, in C terms, the print char system call is more like putchar('*'), not printf("*"). Note that you're passing it a character by value, not a pointer to a 0-terminated string.
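
    As for what's wrong with the loop in the question: $t0 (i) is zeroed only once, before outerloop, so after the first row the inner loop's i < 4 test fails immediately and every later pass prints just the newline. The j < 5 test also sits after the inner loop instead of before it, so even with i reset, one extra row of asterisks would be printed before exiting. A minimal sketch of a fix that keeps the question's nested structure and registers, assuming (like the question's code) that branch-delay slots are disabled, which is the MARS default:

    ```mips
    .text
    .globl main
    main:
        li      $t2, 0              # j = 0, row counter for the outer loop

    outerloop:
        slti    $t3, $t2, 5         # while (j < 5), tested before the row is printed
        beq     $t3, $zero, outerexit

        li      $t0, 0              # i = 0, reset at the start of every row

    innerloop:
        slti    $t1, $t0, 4         # while (i < 4)
        beq     $t1, $zero, innerexit

        li      $v0, 11             # putchar('*')
        li      $a0, '*'
        syscall

        addiu   $t0, $t0, 1         # i++
        j       innerloop
    innerexit:

        li      $v0, 11             # putchar('\n')
        li      $a0, '\n'
        syscall

        addiu   $t2, $t2, 1         # j++
        j       outerloop
    outerexit:

        li      $v0, 10             # exit
        syscall
    ```

    With the bounds from the question (j < 5, i < 4), this prints five rows of four asterisks.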