Tags: arrays, assembly, x86-64, att

Array element comparison in x86-64 Assembly (AT&T syntax)


I'm trying to write a simple procedure in x86-64 assembly that returns the length of an array of ints. The last element in the array is a 0, which should not be counted. The array is passed in as an int * from C code.

My assembly code is as follows:

f1:
    movq $0, %rax   # zero out %rax
    jmp   test      # jump to test
body:
    incq  %rax      # increment %rax, which is counter and array index

test:
    cmpq   $0, (%rdi,%rax,4)  # compare (rdi + (rax * 4)) to 0
    jne    body   # jump if zero flag is not set
    ret

When this runs, I get a result that is incorrect, but not wildly so: instead of 11 (the size of the array passed, minus the terminating 0), I get 38. I think my compare instruction is wrong. My thinking was that since cmpq computes (dest - src) without altering the operands, if the array element is 0, 0 - 0 would yield zero and set the zero flag, but that doesn't seem to be happening.

I can arbitrarily load any element of the array into %rax, which returns the correct value:

movq   (%rdi,%rax,4), %rax   # %rax initially 0, so first element loaded into %rax

Any help would be greatly appreciated!


Solution

  • int is 32 bits (4 bytes) in both x86-64 ABIs (System V and Windows). (See the tag wiki for details.)

    cmpq $0, (%rdi,%rax,4) correctly scales the index by 4, but incorrectly uses 64-bit operand size. (q stands for quad-word; in Intel's x86 terminology, a "word" is 16 bits.) Use cmpl $0, (%rdi,%rax,4) to compare a single 32-bit element.

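    cmpq was comparing 8 bytes at a time, i.e. two consecutive elements at once, so the loop only stopped when two adjacent ints were both zero. That is why it ran past the terminating 0 and returned a count that was too large. The equivalent C would be while( 0 != *(int64_t*)&(array[i]) ){ ++i; }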


    Outside of x86, a word is usually the register size of the machine or something close to it, so it often matches the size of long; e.g. a word is 32 bits on 32-bit MIPS.

    It's just terminology, and it's handy to have convenient names like word (AT&T syntax w suffix), dword (l suffix), qword (q suffix).

    In gdb, "word" is 32 bits in some places even when debugging x86: e.g. the x command to dump memory has b (byte), h (halfword: 16 bits), w (word: 32 bits), and g (giant: 8 bytes) size format specifiers.
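
To make the operand-size difference concrete, here is a minimal C sketch that models both loops. It is only an illustration: the function names len_32bit_compare and len_64bit_compare are hypothetical, int32_t stands in for the 32-bit int discussed above, and memcpy is used for the 8-byte load so the model avoids the alignment and aliasing problems of the pointer cast.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Models the cmpl loop: test one 32-bit element per iteration.
 * Hypothetical name, for illustration only. */
size_t len_32bit_compare(const int32_t *array)
{
    assert(array != NULL);
    size_t i = 0;
    while (array[i] != 0)
        ++i;
    return i;
}

/* Models the cmpq loop: test the 8 bytes starting at element i,
 * i.e. elements i and i+1 together. The loop stops only when both
 * are zero. Assumption: the caller guarantees that two consecutive
 * zero elements exist inside the array, so every load stays in bounds. */
size_t len_64bit_compare(const int32_t *array)
{
    assert(array != NULL);
    size_t i = 0;
    for (;;) {
        int64_t pair;
        memcpy(&pair, &array[i], sizeof pair); /* 8-byte load, like cmpq */
        if (pair == 0)
            return i;
        ++i;
    }
}
```

For a made-up input such as {1, 2, 3, 0, 5, 6, 0, 0}, the 32-bit version returns 3, while the 64-bit version skips the lone 0 and returns 6. This is the same kind of overshoot as in the question. The real cmpq loop has no such guarantee of a second zero, so it keeps reading whatever memory follows the array.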