assembly, x86-64, instructions

What does this set of instructions do?


   7ffff7a97759    mov    0x33b780(%rip),%rax        # 0x7ffff7dd2ee0
   7ffff7a97760    mov    (%rax),%rax
   7ffff7a97763    test   %rax,%rax
   7ffff7a97766    jne    0x7ffff7a9787a

I can't figure out what these instructions do. Can someone explain?


Solution

  • Going one step at a time...

    7ffff7a97759    mov    0x33b780(%rip),%rax        # 0x7ffff7dd2ee0
    

    This:

    1. Takes the address in rip, and adds 0x33b780 to it. At this point, rip contains the address of the next instruction, which is 0x7ffff7a97760. Adding 0x33b780 to that gives you 0x7ffff7dd2ee0, which is the address in the comment.

    2. It copies the 8 byte value stored at that address into rax.

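    Let's agree to call this 8 byte value "the pointer". Because its address is found at a fixed offset from the code, 0x7ffff7dd2ee0 is almost certainly not on the stack; it most likely lies in a data area (for example the .data, .bss or GOT section) of the same shared library that contains this code.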
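
    The address arithmetic in step 1 can be checked with a few lines of C. This is only a sketch: `rip_relative_target` is a hypothetical helper name, and it assumes the caller already knows the address of the next instruction (0x7ffff7a97760 in the listing).

```c
#include <stdint.h>

/* Hypothetical helper: effective address of a RIP-relative memory operand.
 * next_rip is the address of the instruction that follows the one executing;
 * disp is the signed 32-bit displacement encoded in the instruction. */
uint64_t rip_relative_target(uint64_t next_rip, int32_t disp)
{
    /* The displacement is sign-extended to 64 bits before being added. */
    return next_rip + (uint64_t)(int64_t)disp;
}
```

    Calling `rip_relative_target(0x7ffff7a97760, 0x33b780)` yields 0x7ffff7dd2ee0, the address shown in the disassembler's comment.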

    7ffff7a97760    mov    (%rax),%rax
    

    This copies the 8 byte value stored at the address in the pointer into rax.

    7ffff7a97763    test   %rax,%rax
    

    This performs a bitwise AND of rax with itself, discarding the result, but modifying the flags.

    7ffff7a97766    jne    0x7ffff7a9787a
    

    This jumps to location 0x7ffff7a9787a if the result of that bitwise AND is not zero, in other words, if the value stored in rax is not zero.
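
    The test/jne pair can be modelled in C as below. This is a rough sketch, not how the CPU is implemented: `zf_after_test` and `jne_taken` are hypothetical names, and only the zero flag is modelled, although test also updates other flags.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of `test %rax,%rax`: returns the zero flag (ZF).
 * The AND result itself is thrown away; only the flag is kept. */
bool zf_after_test(uint64_t rax)
{
    return (rax & rax) == 0;
}

/* Hypothetical model of `jne`: the branch is taken when ZF is clear,
 * which here means rax was not zero. */
bool jne_taken(uint64_t rax)
{
    return !zf_after_test(rax);
}
```

    Since ANDing a value with itself gives back the same value, `jne_taken(x)` is true exactly when `x` is non-zero.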

    So in summary, this means "find the 8 byte value stored at the address contained in the pointer indicated by rip plus 0x33b780, and if that value is not zero, jump to location 0x7ffff7a9787a". For instance, in C terms, the pointer stored at 0x7ffff7dd2ee0 might be a long *, and this code checks whether the long that it points to contains 0.

    Its equivalent in C might be something like:

    long l = 0;
    long * p = &l;   /*  Assume address of p is 0x7ffff7dd2ee0  */
    
    
    /*  Assembly instructions in your question start here  */
    
    if ( *p == 0 ) {
        /*  This would be the instruction after the jne  */
        /*  Do stuff  */
    }
    
    /*  Location 0x7ffff7a9787a would be here, after the if block  */
    /*  Do other stuff  */
    

    Here's a full program showing the use of this construct, the only difference being we find our pointer with reference to the frame pointer, rather than to the instruction pointer:

    .global _start
    
            .section .rodata
    
    iszerostr:      .ascii  "Value of a is zero\n"
    isntzerostr:    .ascii  "Value of a is not zero\n"
    
            .section .data
    
    a:      .quad   0x00                    #  We'll be testing this for zero...
    
            .section .text
    
    _start:
            mov     %rsp, %rbp              #  Initialize rbp
            sub     $16, %rsp               #  Allocate stack space
            lea     (a), %rax               #  Store pointer to a in rax...
            mov     %rax, -16(%rbp)         #  ...and then store it on stack
    
            #  Start of the equivalent of your code
    
            mov     -16(%rbp), %rax         #  Load pointer to a into rax
            mov     (%rax), %rax            #  Dereference pointer and get value
            test    %rax, %rax              #  Compare pointed-to value to zero
            jne     .notzero                #  Branch if not zero
    
            #  End of the equivalent of your code
    
    .zero:
            lea     (iszerostr), %rsi       #  Address of string
            mov     $19, %rdx               #  Length of string
            jmp     .end
    
    .notzero:
            lea     (isntzerostr), %rsi     #  Address of string
            mov     $23, %rdx               #  Length of string (23 bytes including the newline)
    
    .end:
            mov     $1, %rax                #  write() system call number
            mov     $1, %rdi                #  Standard output
            syscall                         #  Make system call
    
            mov     $60, %rax               #  exit() system call number
            mov     $0, %rdi                #  zero exit status
            syscall                         #  Make system call
    

    with output:

    paul@thoth:~/src/asm$ as -o tso.o tso.s; ld -o tso tso.o
    paul@thoth:~/src/asm$ ./tso
    Value of a is zero
    paul@thoth:~/src/asm$ 
    

    Incidentally, the reason for calculating an offset based on the instruction pointer is to make position-independent code, which shared libraries need, more efficient. Hard-coded memory addresses and shared libraries don't mix well, but if you know that code and data will always be the same distance apart, then referencing code and data via the instruction pointer gives you an easy way to produce relocatable code. Without that ability, it's usually necessary to have a layer of indirection, since relative branches are typically limited in range.
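
    The "same distance apart" point can be illustrated with some simple arithmetic in C. This is a sketch under stated assumptions: `rip_displacement` is a hypothetical function, and the load bases and offsets used with it are made-up numbers, not values taken from the library in the question.

```c
#include <stdint.h>

/* Hypothetical model: a library is loaded at load_base, with the next
 * instruction at next_insn_offset and the data at data_offset, both
 * measured from the start of the library. Returns the displacement that
 * a RIP-relative instruction would need to encode. */
int64_t rip_displacement(uint64_t load_base,
                         uint64_t next_insn_offset,
                         uint64_t data_offset)
{
    uint64_t next_rip  = load_base + next_insn_offset;
    uint64_t data_addr = load_base + data_offset;

    /* load_base cancels out, so the result does not depend on it. */
    return (int64_t)(data_addr - next_rip);
}
```

    With made-up offsets 0x1000 and 0x33c780, the function returns 0x33b780 whatever load base is passed in, which is why the same encoded instruction keeps working wherever the library ends up in memory.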