Tags: assembly, x86-64, undefined-behavior, calling-convention

Why do rax and rdi work the same in this situation?


I wrote this code:

global  strlen
    ; int strlen(const char *string);
strlen:
    xor     rcx, rcx

retry:
    cmp byte    [rdi + rcx], 0
    je      result
    inc     rcx
    jmp     retry

result:
    mov     rax, rcx
    ret

And this is how I test it:

#include <stdio.h>

int main(int argc, char **argv)
{
    char* bob = argv[1];
    printf("%i\n", strlen(bob));
    return 0;
}

This is a working strlen, no problem there. But I've noticed that I can replace the rdi in the first line of the retry block with rax without anything changing. I don't know whether this is normal behavior. Which of the two registers should I keep?


Solution

  • It's just bad luck. Keep rdi; the version with rax works only by accident.

    Without optimisations, GCC 8 uses rax as an intermediate register, first to copy argv[1] into bob and then to copy bob into rdi, the first argument of strlen:

      push rbp
      mov rbp, rsp
      sub rsp, 32
    
      mov DWORD PTR [rbp-20], edi             ;argc
      mov QWORD PTR [rbp-32], rsi             ;argv
    
      mov rax, QWORD PTR [rbp-32]             ;argv
      mov rax, QWORD PTR [rax+8]              ;argv[1]
      mov QWORD PTR [rbp-8], rax              ;bob = argv[1]
    
      mov rax, QWORD PTR [rbp-8]
      mov rdi, rax
      call strlen                             ;strlen(bob)
    
      mov esi, eax
      mov edi, OFFSET FLAT:.LC0
      mov eax, 0
      call printf
    
      mov eax, 0
      leave
      ret
    

    So rax merely happens to hold the same pointer as rdi when strlen is called. This is not documented behaviour; in fact, it fails if you pass a string literal:

    printf("%i\n", strlen("bob"));
    
      mov edi, OFFSET FLAT:.LC1
      call strlen                     ;No RAX here
    
      mov esi, eax
      mov edi, OFFSET FLAT:.LC0
      mov eax, 0
      call printf
    

    The document that specifies how parameters are passed to functions is your OS's ABI; read more in this answer.
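
    As a rough sketch of that contract, here is a C equivalent of the assembly loop, annotated with the matching instructions. It assumes the System V AMD64 calling convention (which the edi/rsi usage in the disassembly above suggests), where the first pointer argument arrives in rdi and the result is returned in rax. The names my_strlen and check_call_shapes are hypothetical, picked only to avoid clashing with the libc strlen.

```c
#include <assert.h>
#include <stddef.h>

/*
 * C equivalent of the assembly routine (hypothetical name my_strlen).
 * Assumption: System V AMD64 ABI, so `s` is passed in rdi and the
 * result is returned in rax. On entry, rax holds whatever the caller
 * happened to leave there, so the routine must not read it.
 */
size_t my_strlen(const char *s)
{
    size_t n = 0;            /* xor rcx, rcx                        */
    while (s[n] != '\0')     /* cmp byte [rdi + rcx], 0 / je result */
        n++;                 /* inc rcx / jmp retry                 */
    return n;                /* mov rax, rcx / ret                  */
}

/*
 * Exercises the two call shapes discussed in the answer: a pointer held
 * in a variable and a string literal passed directly. A correct routine
 * returns the same length for both.
 */
void check_call_shapes(void)
{
    char buffer[] = "bob";
    const char *bob = buffer;         /* pointer held in a variable      */
    assert(my_strlen(bob) == 3);
    assert(my_strlen("bob") == 3);    /* string literal passed directly  */
}
```

    Testing the assembly version with both call shapes, as check_call_shapes does here, is one way to notice an accidental dependency on rax: the literal call is the one that breaks.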


    With optimisations disabled, GCC generates "dumb" code that moves values through registers a lot. This eases debugging (both of GCC itself and of the compiled program) and essentially mimics a beginner: first the variable is read from memory into the first free register (one problem solved), then it is copied into the right register (another one gone), and finally the call is made.
    GCC simply picked the first free register; in this simple program there is no register pressure, so rax is always chosen.