c gcc x86 32-bit

How are numbers greater than 2^32 handled by a 32 bit machine?


I am trying to understand how calculations involving numbers greater than 2^32 happen on a 32-bit machine.

C code

$ cat size.c
#include<stdio.h>
#include<math.h>

int main() {

    printf ("max unsigned long long = %llu\n",
    (unsigned long long)(pow(2, 64) - 1));
}
$

gcc output

$ gcc size.c -o size
$ ./size
max unsigned long long = 18446744073709551615
$

Corresponding assembly code

$ gcc -S size.c -O3
$ cat size.s
    .file   "size.c"
    .section    .rodata.str1.4,"aMS",@progbits,1
    .align 4
.LC0:
    .string "max unsigned long long = %llu\n"
    .text
    .p2align 4,,15
.globl main
    .type   main, @function
main:
    pushl   %ebp
    movl    %esp, %ebp
    andl    $-16, %esp
    subl    $16, %esp
    movl    $-1, 8(%esp)   #1
    movl    $-1, 12(%esp)  #2
    movl    $.LC0, 4(%esp) #3
    movl    $1, (%esp)     #4
    call    __printf_chk
    leave
    ret
    .size   main, .-main
    .ident  "GCC: (Ubuntu 4.4.3-4ubuntu5) 4.4.3"
    .section    .note.GNU-stack,"",@progbits
$

What exactly happens on the lines marked #1 to #4?

Is this some kind of string concatenation at the assembly level?


Solution

  • __printf_chk is a wrapper around printf which checks for stack overflow, and takes an additional first parameter, a flag.

    Because both arguments are constants, GCC has evaluated pow(2, 64) - 1 at compile time and emitted the constant 0xffffffffffffffff; no call to pow() remains in the assembly.

    As per the usual calling conventions, the first argument to __printf_chk() (int flag) is a 32-bit value on the stack (at %esp at the time of the call instruction). The next argument, const char * format, is a 32-bit pointer (the next 32-bit word on the stack, i.e. at %esp+4). And the 64-bit quantity that is being printed occupies the next two 32-bit words (at %esp+8 and %esp+12):

    pushl   %ebp                 ; prologue
    movl    %esp, %ebp           ; prologue
    andl    $-16, %esp           ; align stack pointer
    subl    $16, %esp            ; reserve bytes for stack frame
    movl    $-1, 8(%esp)   #1    ; store low half of 64-bit argument (a constant) to stack
    movl    $-1, 12(%esp)  #2    ; store high half of 64-bit argument (a constant) to stack
    movl    $.LC0, 4(%esp) #3    ; store address of format string to stack
    movl    $1, (%esp)     #4    ; store "flag" argument to __printf_chk to stack
    call    __printf_chk         ; call routine
    leave                        ; epilogue
    ret                          ; epilogue
    

    The compiler has effectively rewritten this:

    printf("max unsigned long long = %llu\n", (unsigned long long)(pow(2, 64) - 1));
    

    ...into this:

    __printf_chk(1, "max unsigned long long = %llu\n", 0xffffffffffffffffULL);
    

    ...and, at runtime, the stack layout for the call looks like this (showing the stack as 32-bit words, with addresses increasing from the bottom of the diagram upwards):

            :                 :
            :     Stack       :
            :                 :
            +-----------------+
    %esp+12 |      0xffffffff | \ 
            +-----------------+  } <-------------------------------------.
    %esp+8  |      0xffffffff | /                                        |
            +-----------------+                                          |
    %esp+4  |address of string| <---------------.                        |
            +-----------------+                 |                        |
    %esp    |               1 | <--.            |                        |
            +-----------------+    |            |                        |
                      __printf_chk(1, "max unsigned long long = %llu\n", |
                                                        0xffffffffffffffffULL);