Tags: recursion, assembly, x86, nasm, local-variables

NASM: ways to create symbols local to a function, and how they behave under recursion


From what I have learned (and I could very well be wrong), there are two ways to give a function local variables in NASM:

  • Reserve space on the stack and address it at an offset from the function's base pointer, so that the stack space acts as local variables.

  • Declare symbols after the function label in an additional section .data or .bss as shown below.

    section .text
    _FindLongestString:
    
    section .data
     .length dd 0
     .address dd 0
    
     push ebp
     mov  ebp, esp
    
     ; more instructions
    
     mov  esp, ebp
     pop ebp
    

If we use the second method, can the function be recursive? Or would every invocation of the function access the same memory when it references a local symbol like .length?

I imagine the first method would not have a problem with recursion, since each invocation would reserve space on the stack for its own variables.


Solution

  • Static storage in .data only exists once, where you put it. In this case, you also put your code inside section .data, because you switched to it and never switched back to section .text before the push ebp.

    .length dd 0 is like static int length; in C: all invocations of your function use the same 4 bytes of static storage.

    If you want unique space for local variables in each invocation of re-entrant code (such as a recursive and/or thread-safe function), use stack space like C compilers would for int length;. (Without static, so automatic storage-class.)
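
The C analogy above can be made concrete with a small sketch. The function names sum_auto and sum_static and the variable saved are made up for illustration and do not appear in the question. Both functions try to add up n + (n-1) + ... + 0 recursively, keeping n in a local across the recursive call. Only the version with automatic storage gets its own copy per invocation.

```c
/* Automatic storage: each invocation gets its own 'saved' in its own
   stack frame, like a slot at [ebp-4] in hand-written asm. */
int sum_auto(int n)
{
    int saved = n;                /* one copy per invocation */
    if (n == 0)
        return 0;
    int rest = sum_auto(n - 1);   /* nested calls cannot touch our 'saved' */
    return saved + rest;
}

/* Static storage: every invocation shares one 'saved', like
   '.length dd 0' placed in .data. */
int sum_static(int n)
{
    static int saved;             /* a single slot for all invocations */
    saved = n;
    if (n == 0)
        return 0;
    int rest = sum_static(n - 1); /* nested calls overwrite 'saved' */
    return saved + rest;          /* 'saved' now holds the innermost value, 0 */
}
```

With these definitions, sum_auto(3) returns 6, while sum_static(3) returns 0, because the innermost call's saved = 0 is what every outer invocation reads back after its recursive call returns.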


    Switching to .data or .bss and back, and using only local labels (names that start with a dot), is a way to get their names scoped to the function while reserving static storage. This is pretty much equivalent to C static int length; at function scope. I'd put the static data after the function body, but that's just a style choice, since each section has its own current position.
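
As a rough illustration of the function-scope static equivalence, here is a sketch with two hypothetical functions (remember_longest_a and remember_longest_b are invented names, not from the question). Each has its own static length. The names do not clash because they are scoped to their function, much as .length under two different non-local NASM labels refers to two different symbols, yet each slot persists across calls.

```c
/* Each function owns a separate static slot named 'length'. */
int remember_longest_a(int candidate)
{
    static int length = 0;    /* scoped to this function, lives for the whole run */
    if (candidate > length)
        length = candidate;
    return length;
}

int remember_longest_b(int candidate)
{
    static int length = 0;    /* same name, different storage */
    if (candidate > length)
        length = candidate;
    return length;
}
```

Calling remember_longest_a(5) and then remember_longest_a(3) returns 5 both times, since the value survives between calls. A later remember_longest_b(1) returns 1, because it never sees the other function's slot.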


    NASM processes the source file in source order, not in execution order: remember, it's just an assembler, not an interpreter. Assembling a call or jmp instruction into machine code doesn't make it go back and re-assemble older lines.

    Each NASM source line tells NASM to append some bytes to the current section of the object file. This is what assembling is. db 0x90 produces exactly the same byte as nop, for example. NASM just knows how to encode instructions; it doesn't care how they'll execute.

    Loops and recursion are a run-time thing, happening long after assembling and linking is done.

    In terms of a train track or race-car analogy, the assembler builds the tracks, the CPU follows them. (Not a perfect analogy here because it's fine for multiple threads to be executing the same function, just not to be using the same static storage unless that's intended.)