Search code examples
assemblyx86x86-64nasmyasm

Woes of implementing selection sort in x86 NASM or YASM Assembly


I am attempting to implement a selection sort of an array in NASM that runs on 64-bit Linux.

The array is declared as:

section .bss
strbuf resb 10
small  resb  1  ; current minimum value

The sort algorithm itself is quite simple, but I feel limited by two things: the number of registers available, and the inability to swap immediates (i.e. brackets)

I need to keep track of the unsorted array boundary, its index, the location of the current smallest value, and its index. That's four registers already. I also need to keep track of two loop counters, one for the outer loop that represents the unsorted array boundary, and one for the iteration on each pass (i.e. inner loop). That's another two, totalling six registers.

As immediates cannot be moved to each other such as mov [var1], [var2] I need to use a register as a temporary placeholder whenever I need to swap two elements. This quickly grows unwieldy in terms of tracking which registers hold what information!

Below is my attempt so far. Please note that this is non-working code that triggers a segmentation fault. But maybe you can perceive what I am trying to do and point out where my approach is going wrong.

I do not wish to use macros that simplify constructs, such as those that provide .IF and .ELSE.

; ====== Sorting begins here ======
sorting:
    mov edi, strbuf  ; outer loop pointer
    mov esi, strbuf  ; inner loop pointer
    mov eax, 0  ; inner loop counter
    mov ebx, 0  ; outer loop counter

innerloop:
    ; store the value of first element in [small]
    mov edx, [esi]
    mov [small], edx

    ; compare the current small value with the value pointed by esi
    mov edx, [esi]  
    cmp [small], edx

    jg  new_small
    inc esi
    inc eax
    cmp eax, 9
    jle innerloop
    cmp eax, 9
    jg  innerloop_done

new_small:
    mov [small], edx  ; save the new small value
    mov ecx, esi      ; save its index in ecx
    inc esi
    inc eax
    cmp eax, 9

    jle     innerloop

innerloop_done:
    ; When the inner loop is completed...
    ; First, do the swap
    push    rax
    mov eax, [edi]
    mov edx, [small]
    mov [ecx], edx
    pop rax

    inc edi  ; move the outer loop pointer forward
    inc esi  ; move the inner loop pointer forward
    inc ebx  ; increment the outer loop counter (the unsorted array becomes smaller)
    inc eax  ; increment the inner loop counter (same reason as above)
    cmp ebx, 9

    jle innerloop   

; ====== Sorting ends here ======

Solution

  • For 64-bit code there are 16 general purpose registers: RAX, RBX, RCX, RDX, RSI, RDI, RSP, RBP, R8, R9, R10, R11, R12, R13, R14, R15.

    Of these, RSP has a special purpose and can only be used for that purpose (current stack address). The RBP register is typically used by compilers for keeping track of the stack frame (excluding "-fomit-frame-pointer" possibilities), but you aren't a compiler and can use it for anything you like.

    This means that out of 15 registers that you could be using, you're only using 6 of them.

    If you actually did run out of registers, then you could shift some things to stack space. For example:

    foo:
        sub rsp,8*5        ;Create space for 5 values
    %define .first   rsp
    %define .second  rsp+8
    %define .third   rsp+8*2
    %define .fourth  rsp+8*3
    %define .fifth   rsp+8*4
    %define .first   rsp+8*5
    
        mov [.first],rax
        mov [.second],rax
        mov rax,[.first]
        add rax,[.second] 
        mov [.third],rax
    ...
        add rsp,8*5        ;Clean up stack
        ret
    

    Hopefully you can see that you could have hundreds of values on the stack, and use a few registers for (temporarily) holding those values if you need to. Normally you'd work out which values are used most often (e.g. in the inner loop) and try to use registers for them and use the stack for the least frequently used variables. However, for 64-bit code (where there are 8 more registers you can use) it's very rare to run out of registers, and if you do it's probably a sign that you need to split the routine into multiple routines.