Change entry point with GNU linker


I have an assembly file with a _start label as the first thing in the .text segment. I would like this label to be the entry point of my application.

Whenever I pass this file together with another file that has a function called main, that main function ends up being the entry point of my application no matter what.

I am using the GNU linker and have tried the -e _start flag, along with changing the input file order. As long as a main function exists, it becomes the entry point. If I rename the main function, it works fine and my _start label becomes the entry point.

EDIT: This seems to be caused by passing the -O2 flag to the compiler.

as.s

.text
.global  _start
_start:
jmp main

main.c

int main(){
    return 0;
}

Compile

gcc -O2 -c as.s -o as.o
gcc -O2 -c main.c -o main.o
ld -e _start as.o main.o -o test

Output

00000000004000b0 <main>:
  4000b0:   31 c0                   xor    %eax,%eax
  4000b2:   c3                      retq   

00000000004000b3 <_start>:
  4000b3:   e9 f8 ff ff ff          jmpq   4000b0 <main>

Any ideas?


Solution

  • It appears your question really is: "How can I place a particular function before all others in the generated executable?"

    First, doing this only has value in certain circumstances. An ELF executable records the entry point address in the ELF header (the e_entry field), so where the entry point code sits within the executable isn't relevant to an ELF loader.

    One special circumstance is a non-Multiboot-compatible kernel, where a custom bootloader loads a kernel that was generated by GCC and converted to binary output. Your question history suggests that bootloader or kernel development may be what you are working on.
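To illustrate the point about the ELF header: the entry address is a field (e_entry) at byte offset 24 of the header, and a loader reads it from there regardless of where _start lands in .text. The following is a minimal sketch rather than anything from your build; elf_entry_from_header is a hypothetical helper name, and it assumes a little-endian ELF file whose first 32 bytes have already been read into memory.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: extract e_entry from the start of an ELF file.
 * Assumes a little-endian file. Returns 0 on success, -1 if the buffer
 * is too short or is not an ELF header this sketch understands. */
int elf_entry_from_header(const unsigned char *buf, size_t len, uint64_t *entry)
{
    static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };
    size_t width;
    uint64_t value = 0;

    if (len < 32 || memcmp(buf, magic, 4) != 0)
        return -1;
    if (buf[5] != 1)            /* EI_DATA: 1 means little-endian */
        return -1;

    if (buf[4] == 1)            /* EI_CLASS: 1 means 32-bit ELF */
        width = 4;
    else if (buf[4] == 2)       /* EI_CLASS: 2 means 64-bit ELF */
        width = 8;
    else
        return -1;

    /* e_entry starts at offset 24 in both the 32-bit and 64-bit headers;
     * only its width differs. Assemble it byte by byte, low byte first. */
    for (size_t i = 0; i < width; i++)
        value |= (uint64_t)buf[24 + i] << (8 * i);

    *entry = value;
    return 0;
}
```

Given the first bytes of test, this should yield the same address that readelf -h test reports as the entry point address.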


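    When using GCC you can't assume that the generated code will be in the order you want. As you have found out, options such as optimization levels may reorder functions relative to each other or eliminate some altogether. With -O2, for example, GCC typically emits main into a .text.startup section, which the default linker script orders ahead of plain .text; that is why main lands before _start in your output.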

    One way to put a function first in an ELF executable is to place it into its own section and then create a linker script to position that section first. An example linker script link.ld that should work with C would be:

    /*OUTPUT_FORMAT("elf32-i386");*/
    OUTPUT_FORMAT("elf64-x86-64");
    
    ENTRY(_start);
    
    SECTIONS
    {
        /* This should be the memory address (VMA) where the code and data
         * will be loaded. On Linux this is typically 0x400000; for a
         * Multiboot loader it is 0x100000, etc. */
        . = 0x400000;
    
        /* Place special section .text.prologue before everything else */
        .text : {
            *(.text.prologue);
            *(.text*);
        }
    
        /* Output the data sections */
        .data : {
            *(.data*);
        }
    
        .rodata : {
            *(.rodata*);
        }
    
        /* The BSS section for uninitialized data */
        .bss : {
            __bss_start = .;
            *(COMMON);
            *(.bss);
            . = ALIGN(4);
            __bss_end = .;
        }
    
        /* Size of the BSS section in case it is needed */
        __bss_size = ((__bss_end)-(__bss_start));
    
        /* Remove the note that may be placed before the code by LD */
        /DISCARD/ : {
            *(.note.gnu.build-id);
        }
    }
    

    This script explicitly places whatever is in the section .text.prologue before any other code. We just need to place _start into that section. Your as.s file could be modified to do this:

    .global  _start
    
    # Start a special section called .text.prologue making it
    # allocatable and executable
    .section .text.prologue, "ax"
    
    _start:
    jmp main
    
    .text
    # All other regular code in the normal .text section
    

    You'd compile, assemble and link them like this:

    gcc -O2 -c main.c -o main.o
    gcc -O2 -c as.s -o as.o
    ld -Tlink.ld main.o as.o -o test
    

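    An objdump -D test should show the function _start before main. The sample output below is from a 32-bit build (note the elf32-i386 file format, matching the commented-out OUTPUT_FORMAT line); a 64-bit build should show the same ordering: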

    test:     file format elf32-i386
    
    
    Disassembly of section .text:
    
    00400000 <_start>:
      400000:       e9 0b 00 00 00          jmp    400010 <main>
      400005:       66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%eax,%eax,1)
      40000c:       00 00 00
      40000f:       90                      nop
    
    00400010 <main>:
      400010:       31 c0                   xor    %eax,%eax
      400012:       c3                      ret
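
A note on the __bss_start and __bss_end symbols in link.ld: if the image is converted to a flat binary and loaded by a custom bootloader, typically nothing zeroes the BSS for you, so the startup code has to do it. A minimal sketch, assuming a freestanding build; zero_region is a hypothetical helper name, not something GCC or the linker provides.

```c
/* Zero every byte in the half-open range [start, end).
 * The volatile qualifier keeps the compiler from replacing the loop
 * with a call to memset, which may not exist in a freestanding build. */
void zero_region(volatile unsigned char *start, volatile unsigned char *end)
{
    while (start < end)
        *start++ = 0;
}

/* Intended use early in the kernel's startup path, with the symbols
 * defined by link.ld:
 *
 *     extern unsigned char __bss_start[], __bss_end[];
 *     zero_region(__bss_start, __bss_end);
 */
```

Declaring the linker symbols as arrays means their names evaluate to the addresses the script assigned, which is what the loop needs.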