assembly, x86-64, gnu-assembler, att

What is the difference between quad operators and long operators?


Simply put, I have the following code:

    #include <stdio.h>
    #define MAXNO 100
    void selectionSort(int [], int);
    int main() // main.c
    {
        int no = 0, i ;
        int data[MAXNO] ;
        printf("Enter the data, terminate with Ctrl+D\n") ;
        while(scanf("%d", &data[no]) != EOF) ++no;
        selectionSort(data, no) ;
        printf("Data in sorted Order are: ") ;
        for(i = 0; i < no; ++i) printf("%d ", data[i]);
        putchar('\n') ;
        return 0 ;
    }

When I run cc -S -Wall main.c, I get the following assembly code:

    .file   "main.c"
    .section    .rodata
    .align 8
.LC0:
    .string "Enter the data, terminate with Ctrl+D"
.LC1:
    .string "%d"
.LC2:
    .string "Data in sorted Order are: "
.LC3:
    .string "%d "
    .text
    .globl  main
    .type   main, @function
main:
.LFB0:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    subq    $416, %rsp
    movl    $0, -408(%rbp)
    movl    $.LC0, %edi
    call    puts
    jmp .L2
.L3:
    addl    $1, -408(%rbp)
.L2:
    leaq    -400(%rbp), %rax
    movl    -408(%rbp), %edx
    movslq  %edx, %rdx
    salq    $2, %rdx
    addq    %rdx, %rax
    movq    %rax, %rsi
    movl    $.LC1, %edi
    movl    $0, %eax
    call    __isoc99_scanf
    cmpl    $-1, %eax
    jne .L3
    movl    -408(%rbp), %edx
    leaq    -400(%rbp), %rax
    movl    %edx, %esi
    movq    %rax, %rdi
    call    selectionSort
    movl    $.LC2, %edi
    movl    $0, %eax
    call    printf
    movl    $0, -404(%rbp)
    jmp .L4
.L5:
    movl    -404(%rbp), %eax
    cltq
    movl    -400(%rbp,%rax,4), %eax
    movl    %eax, %esi
    movl    $.LC3, %edi
    movl    $0, %eax
    call    printf
    addl    $1, -404(%rbp)
.L4:
    movl    -404(%rbp), %eax
    cmpl    -408(%rbp), %eax
    jl  .L5
    movl    $10, %edi
    call    putchar
    movl    $0, %eax
    leave
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE0:
    .size   main, .-main
    .ident  "GCC: (Ubuntu 4.8.4-2ubuntu1~14.04) 4.8.4"
    .section    .note.GNU-stack,"",@progbits

The thing I don't get is:

What is the difference between quad and long operations, like movq and movl? I know that one is for 64-bit and the other for 32-bit operations, but here the integers are 32-bit, right? So why does the code mix movq and movl (and the q and l forms of other instructions, for that matter)?

Edit:

Forgot to add details of my system:

Linux Z510 3.13.0-58-generic #97-Ubuntu SMP Wed Jul 8 02:56:15 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

Solution

  • The size of a long or int value in C depends on the compiler and the ABI it targets, not on the architecture alone. The standard only guarantees minimum ranges (int at least 16 bits, long at least 32 bits) and that a long is at least as wide as an int. This is why fixed-width types such as int32_t and int64_t were added to the C standard, letting you choose a size by need. Your compiler uses 32-bit int, like most C implementations on 64-bit ISAs.

    In addition, the code references values on the stack. Stack slots on your system are 64 bits wide, which makes sense: pushing a single register keeps the stack 8-byte aligned. (That only matters for saving and restoring call-preserved registers here: locals on the stack can be packed together inside an 8-byte stack "slot". Each stack-passed function arg gets its own slot, but the x86-64 System V calling convention passes the first 6 integer or pointer args in registers, and none of these calls needs more than that.)


    In x86, a "word" is 16 bits (they didn't rename existing terminology when extending 16-bit 8086 to 386 or to AMD64).

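    In Intel terminology, a double-word (dword) is 32 bits and a quad-word (qword) is 64 bits. For example, cdqe sign-extends the dword in EAX to a qword in RAX; AT&T syntax calls the same instruction cltq, as in the listing above.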

    The AT&T operand-size suffixes are l for a dword, q for a qword. The l might be for "long", dating back to early 386 or earlier days when the syntax was being designed, and is unrelated to the width of a long in C.

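    For example, movl $0, -408(%rbp) is the int no = 0 initializer, storing a 4-byte 0. With an immediate source and a memory destination, the suffix is the only thing that specifies the size. In many other cases the suffix is redundant, with the size implied by a register operand, like movl $.LC1, %edi to put an address in a register. (This is a Linux non-PIE executable, so absolute symbol addresses are guaranteed to fit in a 32-bit immediate.) EDI is the low 32 bits of RDI, and writing it implicitly zero-extends into RDI.

    The q instructions in the listing operate on addresses: leaq -400(%rbp), %rax and movq %rax, %rsi handle a pointer into data, and pointers are 64 bits wide. That is also why movslq %edx, %rdx appears: it sign-extends the 32-bit index no to 64 bits before it is scaled by 4 and added to the array's address.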
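To see where the l/q mix comes from at the C level, here is a minimal sketch. It assumes a hosted C99 compiler; the names BITS_OF, element_address, and print_widths are made up for illustration and are not part of the original program. The first helper spells out, step by step, what the compiler did for &data[no]: widen the 32-bit index, scale it by sizeof(int), and add it to a 64-bit address. The second prints the type widths that determine which suffix the compiler picks.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative helpers only; these names do not come from the original post. */

/* Width of a type in bits, as seen by this compiler and ABI. */
#define BITS_OF(T) ((int)(sizeof(T) * CHAR_BIT))

/*
 * Hand-written equivalent of &base[index], mirroring the listing:
 *   leaq   -400(%rbp), %rax    64-bit address of the array
 *   movslq %edx, %rdx          sign-extend the 32-bit index to 64 bits
 *   salq   $2, %rdx            scale by sizeof(int), which is 4 there
 *   addq   %rdx, %rax          pointer-width addition
 */
int *element_address(int *base, int index)
{
    assert(base != NULL);
    ptrdiff_t wide = index;                               /* widening step */
    ptrdiff_t byte_offset = wide * (ptrdiff_t)sizeof(int); /* scaling step */
    return (int *)((char *)base + byte_offset);           /* address add   */
}

/* Prints the widths that decide whether an l or a q instruction is used. */
void print_widths(void)
{
    printf("int:     %d bits\n", BITS_OF(int));
    printf("long:    %d bits\n", BITS_OF(long));
    printf("pointer: %d bits\n", BITS_OF(int *));
    printf("int32_t: %d bits, int64_t: %d bits\n",
           BITS_OF(int32_t), BITS_OF(int64_t));
}
```

On an x86-64 Linux system like the one in the question, print_widths() would be expected to report a 32-bit int next to 64-bit long and pointer types, which matches the split between movl/addl/cmpl on int values and leaq/movq/addq on addresses. element_address(data, no) should return the same pointer as &data[no] for any index inside the array.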