What does this set of instructions do?
7ffff7a97759 mov 0x33b780(%rip),%rax # 0x7ffff7dd2ee0
7ffff7a97760 mov (%rax),%rax
7ffff7a97763 test %rax,%rax
7ffff7a97766 jne 0x7ffff7a9787a
I can't figure out what these instructions do. Can someone explain?
Going one step at a time...
7ffff7a97759 mov 0x33b780(%rip),%rax # 0x7ffff7dd2ee0
This:
Takes the address in rip, and adds 0x33b780 to it. At this point, rip contains the address of the next instruction, which is 0x7ffff7a97760. Adding 0x33b780 to that gives you 0x7ffff7dd2ee0, which is the address in the comment.
It copies the 8 byte value stored at that address into rax.
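Let's agree to call this 8 byte value "the pointer". Because it is reached at a fixed offset from the code, 0x7ffff7dd2ee0 is most likely a location in the data area of the same loaded module (for example a shared library's data section or global offset table), rather than on the stack or the heap.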
7ffff7a97760 mov (%rax),%rax
This copies the 8 byte value stored at the address in the pointer into rax.
7ffff7a97763 test %rax,%rax
This performs a bitwise AND of rax with itself, discarding the result, but modifying the flags.
7ffff7a97766 jne 0x7ffff7a9787a
This jumps to location 0x7ffff7a9787a if the result of that bitwise AND is not zero, in other words, if the value stored in rax is not zero.
So in summary, this means "find the 8 byte value stored at the address contained in the pointer indicated by rip plus 0x33b780, and if that value is not zero, jump to location 0x7ffff7a9787a". For instance, in C terms, the pointer stored at 0x7ffff7dd2ee0 might be a long *, and this code checks whether the long that it points to contains 0.
Its equivalent in C might be something like:
long l = 0;
long * p = &l; /* Assume address of p is 0x7ffff7dd2ee0 */
/* Assembly instructions in your question start here */
if ( *p == 0 ) {
    /* This would be the instruction after the jne */
    /* Do stuff */
}
/* Location 0x7ffff7a9787a would be here, after the if block */
/* Do other stuff */
Here's a full program showing the use of this construct, the only difference being we find our pointer with reference to the frame pointer, rather than to the instruction pointer:
.global _start
.section .rodata
iszerostr: .ascii "Value of a is zero\n"
isntzerostr: .ascii "Value of a is not zero\n"
.section .data
a: .quad 0x00 # We'll be testing this for zero...
.section .text
_start:
mov %rsp, %rbp # Initialize rbp
sub $16, %rsp # Allocate stack space
lea (a), %rax # Store pointer to a in rax...
mov %rax, -16(%rbp) # ...and then store it on stack
# Start of the equivalent of your code
mov -16(%rbp), %rax # Load pointer to a into rax
mov (%rax), %rax # Dereference pointer and get value
test %rax, %rax # Compare pointed-to value to zero
jne .notzero # Branch if not zero
# End of the equivalent of your code
.zero:
lea (iszerostr), %rsi # Address of string
mov $19, %rdx # Length of string
jmp .end
.notzero:
lea (isntzerostr), %rsi # Address of string
mov $23, %rdx # Length of string
.end:
mov $1, %rax # write() system call number
mov $1, %rdi # Standard output
syscall # Make system call
mov $60, %rax # exit() system call number
mov $0, %rdi # zero exit status
syscall # Make system call
with output:
paul@thoth:~/src/asm$ as -o tso.o tso.s; ld -o tso tso.o
paul@thoth:~/src/asm$ ./tso
Value of a is zero
paul@thoth:~/src/asm$
Incidentally, the reason for calculating an address as an offset from the instruction pointer is to make position independent code, which shared libraries need, more efficient. Hard-coded memory addresses and shared libraries don't mix well, but if you know that code and data will always be the same distance apart wherever the library is loaded, then referencing data relative to the instruction pointer gives you an easy way to produce relocatable code. Without that ability, it's usually necessary to add a layer of indirection, since relative branches are typically limited in range.