Simply put, I have the following code:
#include <stdio.h>
#define MAXNO 100
void selectionSort(int [], int);
int main() // main.c
{
    int no = 0, i;
    int data[MAXNO];
    printf("Enter the data, terminate with Ctrl+D\n");
    while (scanf("%d", &data[no]) != EOF) ++no;
    selectionSort(data, no);
    printf("Data in sorted Order are: ");
    for (i = 0; i < no; ++i) printf("%d ", data[i]);
    putchar('\n');
    return 0;
}
And I have the following Assembly code generated when I run
cc -S -Wall main.c
.file "main.c"
.section .rodata
.align 8
.LC0:
.string "Enter the data, terminate with Ctrl+D"
.LC1:
.string "%d"
.LC2:
.string "Data in sorted Order are: "
.LC3:
.string "%d "
.text
.globl main
.type main, @function
main:
.LFB0:
.cfi_startproc
pushq %rbp
.cfi_def_cfa_offset 16
.cfi_offset 6, -16
movq %rsp, %rbp
.cfi_def_cfa_register 6
subq $416, %rsp
movl $0, -408(%rbp)
movl $.LC0, %edi
call puts
jmp .L2
.L3:
addl $1, -408(%rbp)
.L2:
leaq -400(%rbp), %rax
movl -408(%rbp), %edx
movslq %edx, %rdx
salq $2, %rdx
addq %rdx, %rax
movq %rax, %rsi
movl $.LC1, %edi
movl $0, %eax
call __isoc99_scanf
cmpl $-1, %eax
jne .L3
movl -408(%rbp), %edx
leaq -400(%rbp), %rax
movl %edx, %esi
movq %rax, %rdi
call selectionSort
movl $.LC2, %edi
movl $0, %eax
call printf
movl $0, -404(%rbp)
jmp .L4
.L5:
movl -404(%rbp), %eax
cltq
movl -400(%rbp,%rax,4), %eax
movl %eax, %esi
movl $.LC3, %edi
movl $0, %eax
call printf
addl $1, -404(%rbp)
.L4:
movl -404(%rbp), %eax
cmpl -408(%rbp), %eax
jl .L5
movl $10, %edi
call putchar
movl $0, %eax
leave
.cfi_def_cfa 7, 8
ret
.cfi_endproc
.LFE0:
.size main, .-main
.ident "GCC: (Ubuntu 4.8.4-2ubuntu1~14.04) 4.8.4"
.section .note.GNU-stack,"",@progbits
The thing I don't get is:
What is the difference between quad and long operations like movq and movl? I know that one is for 64-bit and the other for 32-bit operations, but here the integers are taken as 32-bit, right? So why is there a mix of movq and movl (and other q- and l-suffixed instructions, for that matter) in the code?
Forgot to add details of my system:
Linux Z510 3.13.0-58-generic #97-Ubuntu SMP Wed Jul 8 02:56:15 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux
The size of a long or an int in C is chosen by the compiler (in practice, by the platform ABI the compiler targets), not dictated by the architecture. The C standard only guarantees minimum ranges: an int has at least 16 bits, a long has at least 32 bits, and a long can represent every value of an int. This is why fixed-width types such as int32_t and int64_t (from <stdint.h>) were added to the C standard, allowing you to choose the size by need. It appears that your compiler uses a 32-bit int, like most C implementations on 64-bit ISAs.
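To check what a particular compiler picked, a minimal sketch along these lines can help. The helper names (bits_of_int, bits_of_long, bits_of_pointer, print_type_widths) are made up for illustration; the static assertions only state the minimums the C standard promises, and the actual widths depend on the ABI.

```c
#include <assert.h>   /* static_assert (C11) */
#include <limits.h>   /* CHAR_BIT, INT_MAX, LONG_MAX */
#include <stdint.h>   /* int32_t, int64_t */
#include <stdio.h>

/* Minimum ranges guaranteed by the C standard. */
static_assert(INT_MAX >= 32767, "int has at least 16 bits");
static_assert(LONG_MAX >= 2147483647L, "long has at least 32 bits");
static_assert(LONG_MAX >= INT_MAX, "long covers the range of int");

/* Fixed-width types have exactly the advertised width where they exist. */
static_assert(sizeof(int32_t) * CHAR_BIT == 32, "int32_t is 32 bits");
static_assert(sizeof(int64_t) * CHAR_BIT == 64, "int64_t is 64 bits");

/* Hypothetical helpers: storage width in bits of each type. */
size_t bits_of_int(void)     { return sizeof(int) * CHAR_BIT; }
size_t bits_of_long(void)    { return sizeof(long) * CHAR_BIT; }
size_t bits_of_pointer(void) { return sizeof(void *) * CHAR_BIT; }

/* Print the widths chosen by this compiler and ABI. */
void print_type_widths(void)
{
    printf("int: %zu bits, long: %zu bits, pointer: %zu bits\n",
           bits_of_int(), bits_of_long(), bits_of_pointer());
}
```

Calling print_type_widths() from main on the x86-64 Linux system in the question should report a 32-bit int next to 64-bit pointers, which is why the listing handles int values with l-suffixed instructions and addresses with q-suffixed ones.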
Added to this, there are references to values on the stack. The stack width on the system you are dealing with appears to be 64 bits, which makes sense since it will allow you to push the content of a single register and remain 64 bit aligned. (But that's only relevant for saving/restoring call-preserved registers here: locals on the stack can be packed together inside an 8-byte stack "slot". Each function arg goes in a separate slot, but all these functions have fewer than 6 args so the x86-64 System V calling convention passes them all in registers.)
In x86, a "word" is 16 bits (they didn't rename existing terminology when extending 16-bit 8086 to 386 or to AMD64).
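In Intel terminology, a double-word (dword) is 32 bits and a quad-word (qword) is 64 bits, as in cdqe, which sign-extends the dword in EAX to a qword in RAX. (AT&T syntax spells that instruction cltq, which is what appears in the printing loop of your listing.)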
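In C terms, the dword-to-qword widening corresponds to an integer conversion. The following is a small sketch with made-up function names (widen_signed, widen_unsigned, check_widening); which instruction a compiler actually picks for each conversion is an assumption about typical x86-64 code generation, not something the C standard specifies.

```c
#include <assert.h>
#include <stdint.h>

/* Signed widening copies the sign bit into the upper 32 bits.
   On x86-64 this is typically done with cltq (cdqe) or movslq. */
int64_t widen_signed(int32_t x)
{
    return (int64_t)x;
}

/* Unsigned widening fills the upper 32 bits with zeros.
   On x86-64 a plain 32-bit register write already does this. */
uint64_t widen_unsigned(uint32_t x)
{
    return (uint64_t)x;
}

/* Self-check: the same 32-bit pattern widens differently. */
void check_widening(void)
{
    assert(widen_signed(-1) == -1);                       /* upper half all ones */
    assert(widen_unsigned(UINT32_MAX) == UINT64_C(0xFFFFFFFF)); /* upper half zero */
}
```

This is the reason the listing sign-extends the int index (movslq, cltq) before using it in 64-bit address arithmetic: a negative index has to stay negative once it is 64 bits wide.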
The AT&T operand-size suffixes are l for a dword and q for a qword. The l might be for "long", dating back to early 386 or earlier days when the syntax was being designed, and is unrelated to the width of a long in C.
For example, movl $0, -408(%rbp) is probably the int no = 0 initializer, storing a 4-byte 0. In many cases the suffix is redundant, with the size implied by a register operand, as in movl $.LC1, %edi, which puts an address in a register. (This is a Linux non-PIE executable, so absolute symbol addresses are guaranteed to fit in a 32-bit immediate.) EDI is the low 32 bits of RDI; writing it implicitly zero-extends into RDI.
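The mix of suffixes is easiest to see in the instructions that compute &data[no] for scanf: the index is a 32-bit int (movl), but the address it gets added to is 64 bits wide (leaq, movslq, salq, addq). Below is a rough C rendering of that sequence. The function names element_address and check_element_address are hypothetical, and the round trip through uintptr_t assumes a flat address space, which holds on x86-64 Linux but is implementation-defined in general.

```c
#include <assert.h>
#include <stdint.h>

/* Spells out what &base[index] means, step by step, in the same order
   as the leaq / movl / movslq / salq / addq sequence in the listing. */
int *element_address(int *base, int index)
{
    uintptr_t addr = (uintptr_t)base;       /* leaq: pointer-sized base address */
    intptr_t offset = (intptr_t)index;      /* movl + movslq: load int, sign-extend */
    offset *= (intptr_t)sizeof(int);        /* salq $2 when sizeof(int) == 4 */
    return (int *)(addr + (uintptr_t)offset); /* addq: base + scaled index */
}

/* Self-check against ordinary array indexing. */
void check_element_address(void)
{
    int data[100];
    assert(element_address(data, 0) == &data[0]);
    assert(element_address(data, 42) == &data[42]);
}
```

Anything that holds an int (no, i, the elements of data) is moved with l-suffixed instructions, while anything that holds an address is moved with q-suffixed ones.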