Search code examples
cloopsrecursionassemblytail-recursion

Replacing recursion with a while loop


For performance reasons, are recursions ever replaced with while loops? I know the code looks much uglier, but let's take the following example:

void print_countdown(long long n) {
    if (n!=0) {
        printf("Placeholder for a function\n");
        print_countdown(n-1);
    } else 
        printf("Done!\n");
}

If the number is 1 million, then this will have an overhead of:

  • 1 million copies of the n var (saved from rdi, put back in rdi for the recursive call, if the recursive work includes a function call, otherwise it can stay in rdi.)
  • call func
  • push rbp ... pop function prologue or something else for stack alignment, depending on optimization level and compiler choices.

In other words, while the code is readable, for anything like > 1000 loops, this seems to be better rewritten to something like:

void print_countdown(long long n) {
    while (n < MAX_LOOPS) {
        if (n!=0) {
            printf("Placeholder for a function\n");
            n = n-1;     
        } else
            printf("Done!");
    }
    
}

The assembly code (Godbolt) also looks much, much simpler in the while format -- ~20 lines vs ~40 lines.

Is it common to do this kind of loop-rewriting, and are there ever times in a recursive function call where it cannot be re-written to a loop ?


Solution

  • Yep.

    Proof: https://godbolt.org/z/EqbnfY

    This code

    #include <stdio.h>
    
    void print_countdown(long long n) {
        if (n!=0) {
            // do_something
            print_countdown(n-1);
        } else 
            printf("Done!\n");
    }
    

    Generates this assembly with the x86-64 Clang compiler and -O2 optimizations (the compiler even generates the comments too!)

    print_countdown(long long):                   # @print_countdown(long long)
            mov     edi, offset .Lstr
            jmp     puts                            # TAILCALL
    .Lstr:
            .asciz  "Done!"
    

    Other compilers generate similar results.