Default value of _start


Let's say I have the following assembly program:

.globl _start
_start:
    mov $1, %eax
    int $0x80

And I assemble/link it with:

$ as file.s
$ ld a.out -o a

This runs fine and returns an exit status of 0 to Linux. However, when I remove the line .globl _start, I get the following warning:

ld: warning: cannot find entry symbol _start; defaulting to 0000000000400078

What does 0000000000400078 mean? And if ld expects the _start symbol as the entry point, why is it even necessary to declare .globl _start?


Solution

  • However, when I remove the line .globl _start ...

    The .globl line makes the name _start visible outside the file file.s. If you remove that line, the name _start can only be used inside file.s, and in a larger program (consisting of multiple files) you could even define the name _start in several files.

    (This is similar to static variables in C/C++: If you generate assembler code from C or C++, the difference between real global variables and static variables is that there is a .globl line for the global variables and no .globl line for static variables. And if you are familiar with C, you know that static variables cannot be used in other files.)

    For the same reason, the linker (ld) cannot use the name _start as the entry symbol when the name is local to the file.
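
    The difference in visibility can be checked with nm. The following is a minimal sketch, assuming GNU binutils (as, nm) on x86-64 Linux; the file names global.s and local.s are made up for illustration and do not come from the question.

    ```shell
    # Sketch: assumes GNU binutils (as, nm) on x86-64 Linux.
    # global.s and local.s are illustrative file names.
    printf '.globl _start\n_start:\n    mov $1, %%eax\n    int $0x80\n' > global.s
    printf '_start:\n    mov $1, %%eax\n    int $0x80\n' > local.s
    as global.s -o global.o
    as local.s -o local.o
    nm global.o   # symbol type "T": global symbol in the text section
    nm local.o    # symbol type "t": local symbol in the text section
    ```

    In nm output, an uppercase type letter marks a global symbol and a lowercase one marks a local symbol, so only the first object file offers a _start that ld can see.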

    What does 0000000000400078 mean?

    0x400078 is the address of the first byte of your program's code (the start of the .text section). If no symbol named _start is found, ld assumes that the program starts there.
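
    To see which address ld actually picked, the ELF header can be compared with the section table. This is a rough sketch, assuming GNU binutils (as, ld, readelf) on x86-64 Linux; noglobl.s is a hypothetical file name, and the exact address can differ between ld versions.

    ```shell
    # Sketch: assumes GNU binutils (as, ld, readelf) on x86-64 Linux.
    # noglobl.s is an illustrative file name.
    printf '_start:\n    mov $1, %%eax\n    int $0x80\n' > noglobl.s
    as noglobl.s -o noglobl.o
    ld noglobl.o -o noglobl    # warns that the entry symbol _start cannot be found
    # Entry point recorded in the ELF header:
    readelf -h noglobl | grep 'Entry point address'
    # Address of the .text section, for comparison:
    readelf -S noglobl | grep -F ' .text'
    ```

    Both commands should report the same address, which is the value printed in the ld warning.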

    ... why is it even necessary to declare .globl _start?

    It is not guaranteed that _start is located at the first byte of your program.

    Counterexample:

    .globl _start
    
    write_stdout:
        mov $4, %eax
        mov $1, %ebx
        int $0x80
        ret
    
    exit:
        mov $1, %eax
        mov $0, %ebx
        int $0x80
        jmp exit
    
    _start:
        mov $text, %ecx
        mov $(textend-text), %edx
        call write_stdout
        mov $text2, %ecx
        mov $(textend2-text2), %edx
        call write_stdout
        call exit
    
    text:
        .ascii "Hello\n"
    textend:
    text2:
        .ascii "World\n"
    textend2:
    

    If you remove the .globl line, ld will not be able to find the _start symbol and will assume that your program starts at the first byte, which is the write_stdout: line!

    ... and if you have multiple .s files in a larger program (or even a combination of .s, .c and .cc), you have little control over which code ends up at the first byte of your program!