
Generating an object file ( .o ) for a linker


I want to write a C program that creates .o files for the linker on my computer (ld). gcc does this when I run gcc -c myfile.c. Are there any resources that show how to make an object file for a linker?


Solution

  • In order to create a file similar to a file produced by gcc -c, it is important to first understand the format of the file produced.

    First, create a file with gcc, and then see if its format can be reverse-engineered. I will use the following C program (hello.c) to perform this task:

    #include <stdio.h>
    
    int main(void)
       {
       printf("Hello world\n");
       return(0);
       }
    

    Now, compile the file without linking it:

    gcc -Wall -c -o hello.o hello.c
    

    The above command creates hello.o from hello.c. This file is compiled, but not yet linked.

    Many file formats begin with a few header bytes that identify the format. These identifying bytes are called a 'magic number' (or 'file magic'). Searching for 'file magic' turns up details on how to identify many types of files.

    To identify this file type, look for a magic number in the first few lines of a hexdump of the file:

    > hexdump -Cn 64 hello.o
    00000000  7f 45 4c 46 02 01 01 00  00 00 00 00 00 00 00 00  |.ELF............|
    00000010  02 00 3e 00 01 00 00 00  50 04 40 00 00 00 00 00  |..>.....P.@.....|
    00000020  40 00 00 00 00 00 00 00  18 1a 00 00 00 00 00 00  |@...............|
    00000030  00 00 00 00 40 00 38 00  09 00 40 00 2a 00 27 00  |....@.8...@.*.'.|
    00000040
    

    The file's format is revealed, in this case, by its first four bytes: 7f 45 4c 46, that is, 0x7f followed by the ASCII characters 'ELF'. Hence, the output of gcc -c is a file in the ELF format.
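
    The same check is easy to do from C. The following is a minimal sketch, not part of the original answer: has_elf_magic() is a hypothetical helper name, and it looks only at the first four bytes of a buffer.

    ```c
    #include <assert.h>
    #include <stddef.h>
    #include <string.h>

    /* Returns 1 if buf starts with the ELF magic (0x7f 'E' 'L' 'F'),
       0 otherwise. len is the number of valid bytes in buf. */
    int has_elf_magic(const unsigned char *buf, size_t len)
    {
        static const unsigned char magic[4] = { 0x7f, 'E', 'L', 'F' };

        assert(buf != NULL);
        if (len < sizeof magic)
            return 0;               /* too short to hold the magic */
        return memcmp(buf, magic, sizeof magic) == 0;
    }
    ```

    Given the first bytes of the file from the hexdump above (read with fread(), for example), it returns 1; for a buffer that starts with any other bytes, or that is shorter than four bytes, it returns 0.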

    So, to create similar output from a C program, an understanding of the ELF file format is required.

    Many *nix systems include a program called readelf that decodes the contents of an ELF file and prints them in readable form. For example:

    > readelf -a hello.o
    ELF Header:
      Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
      Class:                             ELF64
      Data:                              2's complement, little endian
      Version:                           1 (current)
      OS/ABI:                            UNIX - System V
      ABI Version:                       0
      Type:                              EXEC (Executable file)
      Machine:                           Advanced Micro Devices X86-64
      Version:                           0x1
      Entry point address:               0x400450
      Start of program headers:          64 (bytes into file)
      Start of section headers:          6680 (bytes into file)
      Flags:                             0x0
      Size of this header:               64 (bytes)
      Size of program headers:           56 (bytes)
      Number of program headers:         9
      Size of section headers:           64 (bytes)
      Number of section headers:         42
      Section header string table index: 39
    
    Section Headers:
      [Nr] Name              Type             Address           Offset
           Size              EntSize          Flags  Link  Info  Align
      [ 0]                   NULL             0000000000000000  00000000
           0000000000000000  0000000000000000           0     0     0
      [ 1] .interp           PROGBITS         0000000000400238  00000238
           000000000000001c  0000000000000000   A       0     0     1
      [ 2] .note.ABI-tag     NOTE             0000000000400254  00000254
           0000000000000020  0000000000000000   A       0     0     4
      [ 3] .note.SuSE        NOTE             0000000000400274  00000274
           0000000000000018  0000000000000000   A       0     0     4
      [ 4] .note.gnu.build-i NOTE             000000000040028c  0000028c
           0000000000000024  0000000000000000   A       0     0     4
      [ 5] .hash             HASH             00000000004002b0  000002b0
           0000000000000024  0000000000000004   A       7     0     8
      [ 6] .gnu.hash         GNU_HASH         00000000004002d8  000002d8
           000000000000001c  0000000000000000   A       7     0     8
      [ 7] .dynsym           DYNSYM           00000000004002f8  000002f8
           0000000000000060  0000000000000018   A       8     1     8
      [ 8] .dynstr           STRTAB           0000000000400358  00000358
           000000000000003d  0000000000000000   A       0     0     1
      [ 9] .gnu.version      VERSYM           0000000000400396  00000396
           0000000000000008  0000000000000002   A       7     0     2
      [10] .gnu.version_r    VERNEED          00000000004003a0  000003a0
           0000000000000020  0000000000000000   A       8     1     8
      [11] .rela.dyn         RELA             00000000004003c0  000003c0
           0000000000000018  0000000000000018   A       7     0     8
      [12] .rela.plt         RELA             00000000004003d8  000003d8
           0000000000000030  0000000000000018   A       7    14     8
      [13] .init             PROGBITS         0000000000400408  00000408
           0000000000000018  0000000000000000  AX       0     0     4
      [14] .plt              PROGBITS         0000000000400420  00000420
           0000000000000030  0000000000000010  AX       0     0     16
      [15] .text             PROGBITS         0000000000400450  00000450
           00000000000001e8  0000000000000000  AX       0     0     16
      [16] .fini             PROGBITS         0000000000400638  00000638
           0000000000000016  0000000000000000  AX       0     0     4
      [17] .rodata           PROGBITS         0000000000400650  00000650
           0000000000000010  0000000000000000   A       0     0     4
      [18] .eh_frame_hdr     PROGBITS         0000000000400660  00000660
           0000000000000034  0000000000000000   A       0     0     4
      [19] .eh_frame         PROGBITS         0000000000400698  00000698
           00000000000000dc  0000000000000000   A       0     0     8
      [20] .ctors            PROGBITS         0000000000600e30  00000e30
           0000000000000010  0000000000000000  WA       0     0     8
      [21] .dtors            PROGBITS         0000000000600e40  00000e40
           0000000000000010  0000000000000000  WA       0     0     8
      [22] .jcr              PROGBITS         0000000000600e50  00000e50
           0000000000000008  0000000000000000  WA       0     0     8
      [23] .dynamic          DYNAMIC          0000000000600e58  00000e58
           00000000000001a0  0000000000000010  WA       8     0     8
      [24] .got              PROGBITS         0000000000600ff8  00000ff8
           0000000000000008  0000000000000008  WA       0     0     8
      [25] .got.plt          PROGBITS         0000000000601000  00001000
           0000000000000028  0000000000000008  WA       0     0     8
      [26] .data             PROGBITS         0000000000601028  00001028
           0000000000000010  0000000000000000  WA       0     0     8
      [27] .bss              NOBITS           0000000000601038  00001038
           0000000000000010  0000000000000000  WA       0     0     8
      [28] .comment          PROGBITS         0000000000000000  00001038
           0000000000000039  0000000000000001  MS       0     0     1
      [29] .comment.SUSE.OPT PROGBITS         0000000000000000  00001071
           0000000000000006  0000000000000001  MS       0     0     1
      [30] .debug_aranges    PROGBITS         0000000000000000  00001080
           0000000000000060  0000000000000000           0     0     16
      [31] .debug_pubnames   PROGBITS         0000000000000000  000010e0
           000000000000005f  0000000000000000           0     0     1
      [32] .debug_info       PROGBITS         0000000000000000  0000113f
           0000000000000232  0000000000000000           0     0     1
      [33] .debug_abbrev     PROGBITS         0000000000000000  00001371
           0000000000000133  0000000000000000           0     0     1
      [34] .debug_line       PROGBITS         0000000000000000  000014a4
           000000000000011e  0000000000000000           0     0     1
      [35] .debug_frame      PROGBITS         0000000000000000  000015c8
           0000000000000058  0000000000000000           0     0     8
      [36] .debug_str        PROGBITS         0000000000000000  00001620
           0000000000000115  0000000000000001  MS       0     0     1
      [37] .debug_loc        PROGBITS         0000000000000000  00001735
           00000000000000fe  0000000000000000           0     0     1
      [38] .debug_ranges     PROGBITS         0000000000000000  00001833
           0000000000000050  0000000000000000           0     0     1
      [39] .shstrtab         STRTAB           0000000000000000  00001883
           0000000000000192  0000000000000000           0     0     1
      [40] .symtab           SYMTAB           0000000000000000  00002498
           00000000000007c8  0000000000000018          41    65     8
      [41] .strtab           STRTAB           0000000000000000  00002c60
           0000000000000244  0000000000000000           0     0     1
    Key to Flags:
      W (write), A (alloc), X (execute), M (merge), S (strings), l (large)
      I (info), L (link order), G (group), T (TLS), E (exclude), x (unknown)
      O (extra OS processing required) o (OS specific), p (processor specific)
    
    There are no section groups in this file.
    
    Program Headers:
      Type           Offset             VirtAddr           PhysAddr
                     FileSiz            MemSiz              Flags  Align
      PHDR           0x0000000000000040 0x0000000000400040 0x0000000000400040
                     0x00000000000001f8 0x00000000000001f8  R E    8
      INTERP         0x0000000000000238 0x0000000000400238 0x0000000000400238
                     0x000000000000001c 0x000000000000001c  R      1
          [Requesting program interpreter: /lib64/ld-linux-x86-64.so.2]
      LOAD           0x0000000000000000 0x0000000000400000 0x0000000000400000
                     0x0000000000000774 0x0000000000000774  R E    200000
      LOAD           0x0000000000000e30 0x0000000000600e30 0x0000000000600e30
                     0x0000000000000208 0x0000000000000218  RW     200000
      DYNAMIC        0x0000000000000e58 0x0000000000600e58 0x0000000000600e58
                     0x00000000000001a0 0x00000000000001a0  RW     8
      NOTE           0x0000000000000254 0x0000000000400254 0x0000000000400254
                     0x000000000000005c 0x000000000000005c  R      4
      GNU_EH_FRAME   0x0000000000000660 0x0000000000400660 0x0000000000400660
                     0x0000000000000034 0x0000000000000034  R      4
      GNU_STACK      0x0000000000000000 0x0000000000000000 0x0000000000000000
                     0x0000000000000000 0x0000000000000000  RW     8
      GNU_RELRO      0x0000000000000e30 0x0000000000600e30 0x0000000000600e30
                     0x00000000000001d0 0x00000000000001d0  R      1
    
     Section to Segment mapping:
      Segment Sections...
       00
       01     .interp
       02     .interp .note.ABI-tag .note.SuSE .note.gnu.build-id .hash .gnu.hash .dynsym .dynstr .gnu.version .gnu.version_r .rela.dyn .rela.plt .init .plt .text .fini .rodata .eh_frame_hdr .eh_frame
       03     .ctors .dtors .jcr .dynamic .got .got.plt .data .bss
       04     .dynamic
       05     .note.ABI-tag .note.SuSE .note.gnu.build-id
       06     .eh_frame_hdr
       07
       08     .ctors .dtors .jcr .dynamic .got
    
    Dynamic section at offset 0xe58 contains 21 entries:
      Tag        Type                         Name/Value
     0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
     0x000000000000000c (INIT)               0x400408
     0x000000000000000d (FINI)               0x400638
     0x0000000000000004 (HASH)               0x4002b0
     0x000000006ffffef5 (GNU_HASH)           0x4002d8
     0x0000000000000005 (STRTAB)             0x400358
     0x0000000000000006 (SYMTAB)             0x4002f8
     0x000000000000000a (STRSZ)              61 (bytes)
     0x000000000000000b (SYMENT)             24 (bytes)
     0x0000000000000015 (DEBUG)              0x0
     0x0000000000000003 (PLTGOT)             0x601000
     0x0000000000000002 (PLTRELSZ)           48 (bytes)
     0x0000000000000014 (PLTREL)             RELA
     0x0000000000000017 (JMPREL)             0x4003d8
     0x0000000000000007 (RELA)               0x4003c0
     0x0000000000000008 (RELASZ)             24 (bytes)
     0x0000000000000009 (RELAENT)            24 (bytes)
     0x000000006ffffffe (VERNEED)            0x4003a0
     0x000000006fffffff (VERNEEDNUM)         1
     0x000000006ffffff0 (VERSYM)             0x400396
     0x0000000000000000 (NULL)               0x0
    
    Relocation section '.rela.dyn' at offset 0x3c0 contains 1 entries:
      Offset          Info           Type           Sym. Value    Sym. Name + Addend
    000000600ff8  000300000006 R_X86_64_GLOB_DAT 0000000000000000 __gmon_start__ + 0
    
    Relocation section '.rela.plt' at offset 0x3d8 contains 2 entries:
      Offset          Info           Type           Sym. Value    Sym. Name + Addend
    000000601018  000100000007 R_X86_64_JUMP_SLO 0000000000000000 puts + 0
    000000601020  000200000007 R_X86_64_JUMP_SLO 0000000000000000 __libc_start_main + 0
    
    The decoding of unwind sections for machine type Advanced Micro Devices X86-64 is not currently supported.
    
    Symbol table '.dynsym' contains 4 entries:
       Num:    Value          Size Type    Bind   Vis      Ndx Name
         0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
         1: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND puts@GLIBC_2.2.5 (2)
         2: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __libc_start_main@GLIBC_2.2.5 (2)
         3: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND __gmon_start__
    
    Symbol table '.symtab' contains 83 entries:
       Num:    Value          Size Type    Bind   Vis      Ndx Name
         0: 0000000000000000     0 NOTYPE  LOCAL  DEFAULT  UND
         1: 0000000000400238     0 SECTION LOCAL  DEFAULT    1
         2: 0000000000400254     0 SECTION LOCAL  DEFAULT    2
         3: 0000000000400274     0 SECTION LOCAL  DEFAULT    3
         4: 000000000040028c     0 SECTION LOCAL  DEFAULT    4
         5: 00000000004002b0     0 SECTION LOCAL  DEFAULT    5
         6: 00000000004002d8     0 SECTION LOCAL  DEFAULT    6
         7: 00000000004002f8     0 SECTION LOCAL  DEFAULT    7
         8: 0000000000400358     0 SECTION LOCAL  DEFAULT    8
         9: 0000000000400396     0 SECTION LOCAL  DEFAULT    9
        10: 00000000004003a0     0 SECTION LOCAL  DEFAULT   10
        11: 00000000004003c0     0 SECTION LOCAL  DEFAULT   11
        12: 00000000004003d8     0 SECTION LOCAL  DEFAULT   12
        13: 0000000000400408     0 SECTION LOCAL  DEFAULT   13
        14: 0000000000400420     0 SECTION LOCAL  DEFAULT   14
        15: 0000000000400450     0 SECTION LOCAL  DEFAULT   15
        16: 0000000000400638     0 SECTION LOCAL  DEFAULT   16
        17: 0000000000400650     0 SECTION LOCAL  DEFAULT   17
        18: 0000000000400660     0 SECTION LOCAL  DEFAULT   18
        19: 0000000000400698     0 SECTION LOCAL  DEFAULT   19
        20: 0000000000600e30     0 SECTION LOCAL  DEFAULT   20
        21: 0000000000600e40     0 SECTION LOCAL  DEFAULT   21
        22: 0000000000600e50     0 SECTION LOCAL  DEFAULT   22
        23: 0000000000600e58     0 SECTION LOCAL  DEFAULT   23
        24: 0000000000600ff8     0 SECTION LOCAL  DEFAULT   24
        25: 0000000000601000     0 SECTION LOCAL  DEFAULT   25
        26: 0000000000601028     0 SECTION LOCAL  DEFAULT   26
        27: 0000000000601038     0 SECTION LOCAL  DEFAULT   27
        28: 0000000000000000     0 SECTION LOCAL  DEFAULT   28
        29: 0000000000000000     0 SECTION LOCAL  DEFAULT   29
        30: 0000000000000000     0 SECTION LOCAL  DEFAULT   30
        31: 0000000000000000     0 SECTION LOCAL  DEFAULT   31
        32: 0000000000000000     0 SECTION LOCAL  DEFAULT   32
        33: 0000000000000000     0 SECTION LOCAL  DEFAULT   33
        34: 0000000000000000     0 SECTION LOCAL  DEFAULT   34
        35: 0000000000000000     0 SECTION LOCAL  DEFAULT   35
        36: 0000000000000000     0 SECTION LOCAL  DEFAULT   36
        37: 0000000000000000     0 SECTION LOCAL  DEFAULT   37
        38: 0000000000000000     0 SECTION LOCAL  DEFAULT   38
        39: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS init.c
        40: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS
        41: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS initfini.c
        42: 000000000040047c     0 FUNC    LOCAL  DEFAULT   15 call_gmon_start
        43: 0000000000400648     0 NOTYPE  LOCAL  DEFAULT   16 _real_fini
        44: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS crtstuff.c
        45: 0000000000600e30     0 OBJECT  LOCAL  DEFAULT   20 __CTOR_LIST__
        46: 0000000000600e40     0 OBJECT  LOCAL  DEFAULT   21 __DTOR_LIST__
        47: 0000000000600e50     0 OBJECT  LOCAL  DEFAULT   22 __JCR_LIST__
        48: 00000000004004a0     0 FUNC    LOCAL  DEFAULT   15 __do_global_dtors_aux
        49: 0000000000601038     1 OBJECT  LOCAL  DEFAULT   27 completed.6159
        50: 0000000000601040     8 OBJECT  LOCAL  DEFAULT   27 dtor_idx.6161
        51: 0000000000400510     0 FUNC    LOCAL  DEFAULT   15 frame_dummy
        52: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS crtstuff.c
        53: 0000000000600e38     0 OBJECT  LOCAL  DEFAULT   20 __CTOR_END__
        54: 0000000000400770     0 OBJECT  LOCAL  DEFAULT   19 __FRAME_END__
        55: 0000000000600e50     0 OBJECT  LOCAL  DEFAULT   22 __JCR_END__
        56: 0000000000400600     0 FUNC    LOCAL  DEFAULT   15 __do_global_ctors_aux
        57: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS initfini.c
        58: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS 24173361_generating-an-ob
        59: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS elf-init.c
        60: 0000000000000000     0 FILE    LOCAL  DEFAULT  ABS
        61: 0000000000600e2c     0 NOTYPE  LOCAL  DEFAULT   20 __init_array_end
        62: 0000000000600e58     0 OBJECT  LOCAL  DEFAULT   23 _DYNAMIC
        63: 0000000000600e2c     0 NOTYPE  LOCAL  DEFAULT   20 __init_array_start
        64: 0000000000601000     0 OBJECT  LOCAL  DEFAULT   25 _GLOBAL_OFFSET_TABLE_
        65: 0000000000400560     2 FUNC    GLOBAL DEFAULT   15 __libc_csu_fini
        66: 0000000000601028     0 NOTYPE  WEAK   DEFAULT   26 data_start
        67: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND puts@@GLIBC_2.2.5
        68: 0000000000601038     0 NOTYPE  GLOBAL DEFAULT   26 _edata
        69: 0000000000400638    16 FUNC    GLOBAL DEFAULT   16 _fini
        70: 0000000000600e48     0 OBJECT  GLOBAL HIDDEN    21 __DTOR_END__
        71: 0000000000000000     0 FUNC    GLOBAL DEFAULT  UND __libc_start_main@@GLIBC_
        72: 0000000000601028     0 NOTYPE  GLOBAL DEFAULT   26 __data_start
        73: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND __gmon_start__
        74: 0000000000601030     0 OBJECT  GLOBAL HIDDEN    26 __dso_handle
        75: 0000000000400650     4 OBJECT  GLOBAL DEFAULT   17 _IO_stdin_used
        76: 0000000000400570   137 FUNC    GLOBAL DEFAULT   15 __libc_csu_init
        77: 0000000000601048     0 NOTYPE  GLOBAL DEFAULT   27 _end
        78: 0000000000400450     0 FUNC    GLOBAL DEFAULT   15 _start
        79: 0000000000601038     0 NOTYPE  GLOBAL DEFAULT   27 __bss_start
        80: 000000000040053c    21 FUNC    GLOBAL DEFAULT   15 main
        81: 0000000000000000     0 NOTYPE  WEAK   DEFAULT  UND _Jv_RegisterClasses
        82: 0000000000400408     0 FUNC    GLOBAL DEFAULT   13 _init
    
    Histogram for bucket list length (total of 3 buckets):
     Length  Number     % of total  Coverage
          0  1          ( 33.3%)
          1  1          ( 33.3%)     33.3%
          2  1          ( 33.3%)    100.0%
    
    Version symbols section '.gnu.version' contains 4 entries:
     Addr: 0000000000400396  Offset: 0x000396  Link: 7 (.dynsym)
    000:   0 (*local*)       2 (GLIBC_2.2.5)   2 (GLIBC_2.2.5)   0 (*local*)
    
    Version needs section '.gnu.version_r' contains 1 entries:
     Addr: 0x00000000004003a0  Offset: 0x0003a0  Link: 8 (.dynstr)
      000000: Version: 1  File: libc.so.6  Cnt: 1
      0x0010:   Name: GLIBC_2.2.5  Flags: none  Version: 2
    
    Notes at offset 0x00000254 with length 0x00000020:
      Owner                 Data size   Description
      GNU                  0x00000010   NT_GNU_ABI_TAG (ABI version tag)
      OS: Linux, ABI: 2.6.4
    
    Notes at offset 0x00000274 with length 0x00000018:
      Owner                 Data size   Description
      SuSE                 0x00000004   Unknown note type: (0x45537553)
    
    Notes at offset 0x0000028c with length 0x00000024:
      Owner                 Data size   Description
      GNU                  0x00000014   NT_GNU_BUILD_ID (unique build ID bitstring)
      Build ID: 710ef458701fed230f91233a227b7b10c024e2ed
    

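    Wow, that is a lot of detail for such a little file. (The Type: EXEC line, the program headers, and the dynamic section indicate that this particular dump came from a fully linked executable. For a relocatable object produced by gcc -c, readelf reports Type: REL and no program headers, but the header layout is the same.) Here is how to break it down programmatically. First, the ELF64 header: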

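    typedef struct elf64_hdr {
       unsigned char e_ident[EI_NIDENT];     /* ELF "magic number" and identification bytes */
       Elf64_Half e_type;            /* Object file type (ET_REL, ET_EXEC, ...) */
       Elf64_Half e_machine;         /* Target architecture (e.g. EM_X86_64) */
       Elf64_Word e_version;         /* Object file version (EV_CURRENT) */
       Elf64_Addr e_entry;           /* Entry point virtual address */
       Elf64_Off e_phoff;            /* Program header table file offset */
       Elf64_Off e_shoff;            /* Section header table file offset */
       Elf64_Word e_flags;           /* Processor-specific flags */
       Elf64_Half e_ehsize;          /* Size of this header in bytes */
       Elf64_Half e_phentsize;       /* Size of one program header entry */
       Elf64_Half e_phnum;           /* Number of program header entries */
       Elf64_Half e_shentsize;       /* Size of one section header entry */
       Elf64_Half e_shnum;           /* Number of section header entries */
       Elf64_Half e_shstrndx;        /* Index of the section name string table */
       } Elf64_Ehdr;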
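
    To see how the raw bytes map onto this structure, here is a small sketch that copies the start of a file image into an Elf64_Ehdr. The function names parse_elf64_header() and elf_type_name() are made up for illustration, and the code assumes a Linux system with <elf.h> and a host whose byte order matches the file (little endian here).

    ```c
    #include <assert.h>
    #include <elf.h>
    #include <stdio.h>
    #include <string.h>

    /* Copy the ELF64 header found at the start of buf into *eh.
       Returns 0 on success, -1 if buf is too short, lacks the ELF
       magic, or is not a 64-bit ELF file. */
    int parse_elf64_header(const unsigned char *buf, size_t len, Elf64_Ehdr *eh)
    {
        assert(buf != NULL && eh != NULL);
        if (len < sizeof *eh)
            return -1;
        if (memcmp(buf, ELFMAG, SELFMAG) != 0 || buf[EI_CLASS] != ELFCLASS64)
            return -1;
        memcpy(eh, buf, sizeof *eh);   /* assumes host byte order == file byte order */
        return 0;
    }

    /* Readable name for the common e_type values. */
    const char *elf_type_name(Elf64_Half type)
    {
        switch (type) {
        case ET_REL:  return "REL (Relocatable file)";
        case ET_EXEC: return "EXEC (Executable file)";
        case ET_DYN:  return "DYN (Shared object file)";
        case ET_CORE: return "CORE (Core file)";
        default:      return "unknown";
        }
    }
    ```

    Fed the 64 bytes from the hexdump shown earlier, this yields e_type ET_EXEC, e_entry 0x400450, e_shoff 6680, e_shnum 42 and e_shstrndx 39, matching the readelf header listing.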
    

    The key to creating an ELF file is understanding the above structure. This header is the first thing in the file that your C program must write. The full format is too lengthy to detail here, but you can find this structure, and the others that follow it in the file, in elf.h, which on most Linux systems is located at /usr/include/elf.h.
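
    As a starting point, the header for a relocatable x86-64 object could be filled in and written roughly as follows. This is only a sketch under several assumptions: init_rel_header() and write_header() are hypothetical names, the host is little endian (so the struct can be written as-is), and no sections are emitted yet. A file that ld can actually use also needs section headers plus at least code, symbol table and string table sections.

    ```c
    #include <assert.h>
    #include <elf.h>
    #include <stdio.h>
    #include <string.h>

    /* Fill *eh with an ELF64 header describing an empty relocatable
       x86-64 object (the kind of file gcc -c produces). */
    void init_rel_header(Elf64_Ehdr *eh)
    {
        assert(eh != NULL);
        memset(eh, 0, sizeof *eh);

        memcpy(eh->e_ident, ELFMAG, SELFMAG);     /* 7f 45 4c 46 */
        eh->e_ident[EI_CLASS]   = ELFCLASS64;     /* 64-bit objects */
        eh->e_ident[EI_DATA]    = ELFDATA2LSB;    /* little endian */
        eh->e_ident[EI_VERSION] = EV_CURRENT;
        eh->e_ident[EI_OSABI]   = ELFOSABI_SYSV;

        eh->e_type      = ET_REL;                 /* relocatable, not EXEC */
        eh->e_machine   = EM_X86_64;
        eh->e_version   = EV_CURRENT;
        eh->e_ehsize    = sizeof(Elf64_Ehdr);
        eh->e_shentsize = sizeof(Elf64_Shdr);
        /* e_entry, e_phoff and e_phnum stay 0: no program headers.
           e_shoff, e_shnum and e_shstrndx stay 0 until sections exist. */
    }

    /* Write the header to path. Returns 0 on success, -1 on failure. */
    int write_header(const char *path, const Elf64_Ehdr *eh)
    {
        FILE *fp = fopen(path, "wb");
        if (fp == NULL)
            return -1;
        size_t n = fwrite(eh, sizeof *eh, 1, fp);
        if (fclose(fp) != 0 || n != 1)
            return -1;
        return 0;
    }
    ```

    Calling init_rel_header(&eh) followed by write_header("empty.o", &eh) should produce a 64-byte file whose header readelf -h can decode. From there, the remaining work is appending section contents and a section header table and then setting e_shoff, e_shnum and e_shstrndx to match.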