assembly x86-64 ld elf gnu-assembler

How to use GNU Assembler (GAS) to create a hand-written ELF File from the corresponding .s file


Introduction

I am trying to learn about ELF files and experiment with them a little. I am currently following the tutorial here, which builds a tiny 32-bit ELF file that just exits with code 42. The tutorial uses NASM and ld to create the ELF files. However, I would like to use the GNU Assembler (GAS) to create a 64-bit ELF file instead, as I am somewhat familiar with it (though not an expert).

I am stuck at the part of the tutorial where they start making ELF files from scratch, which involves writing the ELF header and the program headers by hand. Since 32-bit and 64-bit ELF files differ slightly (for example, in the size of the ELF header), my version of the .s file is below:

.intel_syntax noprefix
ehdr:
        .byte 0x7F
        .byte 0x45 # E
        .byte 0x4c # L
        .byte 0x46 # F
        .byte 0x02 # 64 bit ; CLASS
        .byte 0x01 # lsb ; DATA
        .byte 0x01 # ; VERSION
        .byte 0x00 # None/System V ; OS ABI
        .8byte 0x0 # ABI VERSION + PADDING

        .2byte 0x02 # ET_EXEC ; E_TYPE
        .2byte 0x3E # AMD64 ; E_MACHINE
        .4byte 0x01 # 1 ; E_VERSION
        .8byte _start # ; E_ENTRY
        .8byte phdr - ehdr # offset into program header ; E_PHOFF
        .8byte 0x00 # offset into section header ; E_SHOFF
        .4byte 0x00 # flag ; E_FLAGS
        .2byte ehdrsize # ELF header size ; E_EHSIZE
        .2byte phdrsize # Program header size ; E_PHENTSIZE
        .2byte 0x01 # Number of program headers ; E_PHNUM
        .2byte 0x00 # Section header size ; E_SHENTSIZE
        .2byte 0x00 # Number of section headers ; E_SHNUM
        .2byte 0x00 # Section header string table index ; E_SHSTRNDX

DECLARE_ELF_HEADER_SIZE:
.set ehdrsize,  DECLARE_ELF_HEADER_SIZE - ehdr

phdr:
        .4byte 0x01 # PT_LOAD ; P_TYPE
        .4byte 0x00 # located at offset 0??? ; P_OFFSET
        .8byte ehdr # ; P_VADDR
        .8byte ehdr # ; P_PADDR
        .8byte filesize # ; P_FILESZ
        .8byte filesize # ; P_MEMSZ
        .8byte 5 # R-X ; P_FLAGS
        .8byte 0x1000 # P_ALIGN

DECLARE_PHEADER_SIZE:
.set phdrsize, DECLARE_PHEADER_SIZE - phdr

_start:
        mov eax, 60
        mov edi, 42
        syscall

DECLARE_FILE_SIZE:
.set filesize, DECLARE_FILE_SIZE - ehdr

More on ELF file headers can be found by looking at the corresponding source code here. As I am still learning about ELF files, the above code might be incorrect (please let me know in case you find any fault!)
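
For reference, the size difference between the 32-bit and 64-bit headers can be checked with a minimal Python sketch. It assumes the standard Elf32/Elf64 layouts for little-endian files; the struct format strings and the SIZES name are a hand transcription for illustration, not something taken from the tutorial.

```python
import struct

# Format strings transcribed from the standard ELF header layouts.
# The "<" prefix selects little-endian with no padding, so calcsize()
# reports the exact number of bytes each structure occupies on disk.
LAYOUTS = {
    # e_ident, e_type, e_machine, e_version, e_entry, e_phoff, e_shoff,
    # e_flags, e_ehsize, e_phentsize, e_phnum, e_shentsize, e_shnum, e_shstrndx
    "Elf32_Ehdr": "<16sHHIIIIIHHHHHH",
    "Elf64_Ehdr": "<16sHHIQQQIHHHHHH",
    # 32-bit: p_type, p_offset, p_vaddr, p_paddr, p_filesz, p_memsz, p_flags, p_align
    "Elf32_Phdr": "<IIIIIIII",
    # 64-bit: p_type, p_flags, p_offset, p_vaddr, p_paddr, p_filesz, p_memsz, p_align
    "Elf64_Phdr": "<IIQQQQQQ",
}

SIZES = {name: struct.calcsize(fmt) for name, fmt in LAYOUTS.items()}

for name, size in SIZES.items():
    print(f"{name}: {size} bytes")
```

The 64-bit values should agree with the "Size of this header" and "Size of program headers" lines that readelf -h prints for a 64-bit file.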

Problem

Assuming the above code is correct, I would like to convert the above .s file into an ELF file. For this, the tutorial uses the command:

$ nasm -f bin -o a.out tiny.asm

However, I cannot work out which command (or commands) I should use with GAS to produce the equivalent ELF file, and that is the problem I am facing.

What I Have Tried...

I have tried various approaches. I will list them below:

  1. There is a similar question on SO, which can be found here. This approach uses the following 2 commands to get the corresponding ELF file:
$ as --64 -o test.o test.s 
$ ld -Ttext 200000 --oformat binary -o test.bin test.o

Running the same commands for the .s file I have written, I do get an ELF file, along with the following warning:

ld: warning: cannot find entry symbol _start; defaulting to 0000000000200000

Running the generated ELF file causes a segmentation fault, and it exits with code 139. The ELF header for the generated file is below (obtained by running readelf -h <output ELF file>):

ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x200078
  Start of program headers:          64 (bytes into file)
  Start of section headers:          0 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         1
  Size of section headers:           0 (bytes)
  Number of section headers:         0
  Section header string table index: 0
readelf: Error: the segment's file size is larger than its memory size

I believe the segmentation fault happens because the entry point address is most likely incorrect (the warning from ld also hints at that).

  2. In this approach, I modified the commands:
$ as --64 -o test.o test.s 
$ ld -Ttext 200000 --oformat binary -o test.bin test.o

to

$ as --64 -o test.o test.s   # same as before
$ ld -nmagic -s test.o       # changed

This also creates an ELF file, but again with the following warning:

ld: warning: cannot find entry symbol _start; defaulting to 0000000000400078

And this again hints that we are going to get a segmentation fault, which is indeed the case with the new ELF file. The ELF header for this file is:

ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x400078
  Start of program headers:          64 (bytes into file)
  Start of section headers:          0 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         1
  Size of section headers:           64 (bytes)
  Number of section headers:         0
  Section header string table index: 0

No luck!

  3. From the warnings given by ld, it looks like we might want to add .global _start in our .s file. Below is the new .s file that contains this line as well:
.intel_syntax noprefix
.global _start   # Added here
ehdr:
        .byte 0x7F
        .byte 0x45 # E
        .byte 0x4c # L
        .byte 0x46 # F
        .byte 0x02 # 64 bit ; CLASS
        .byte 0x01 # lsb ; DATA
        .byte 0x01 # ; VERSION
        .byte 0x00 # None/System V ; OS ABI
        .8byte 0x0 # ABI VERSION + PADDING

        .2byte 0x02 # ET_EXEC ; E_TYPE
        .2byte 0x3E # AMD64 ; E_MACHINE
        .4byte 0x01 # 1 ; E_VERSION
        .8byte _start # ; E_ENTRY
        .8byte phdr - ehdr # offset into program header ; E_PHOFF
        .8byte 0x00 # offset into section header ; E_SHOFF
        .4byte 0x00 # flag ; E_FLAGS
        .2byte ehdrsize # ELF header size ; E_EHSIZE
        .2byte phdrsize # Program header size ; E_PHENTSIZE
        .2byte 0x01 # Number of program headers ; E_PHNUM
        .2byte 0x00 # Section header size ; E_SHENTSIZE
        .2byte 0x00 # Number of section headers ; E_SHNUM
        .2byte 0x00 # Section header string table index ; E_SHSTRNDX

DECLARE_ELF_HEADER_SIZE:
.set ehdrsize,  DECLARE_ELF_HEADER_SIZE - ehdr

phdr:
        .4byte 0x01 # PT_LOAD ; P_TYPE
        .4byte 0x00 # located at offset 0??? ; P_OFFSET
        .8byte ehdr # ; P_VADDR
        .8byte ehdr # ; P_PADDR
        .8byte filesize # ; P_FILESZ
        .8byte filesize # ; P_MEMSZ
        .8byte 5 # R-X ; P_FLAGS
        .8byte 0x1000 # P_ALIGN

DECLARE_PHEADER_SIZE:
.set phdrsize, DECLARE_PHEADER_SIZE - phdr

_start:
        mov eax, 60
        mov edi, 42
        syscall

DECLARE_FILE_SIZE:
.set filesize, DECLARE_FILE_SIZE - ehdr

Again, we create the ELF file for the above code using:

$ as --64 -o test.o test.s 
$ ld -nmagic -s test.o 

This time, we don't get any errors and the ELF file does what it is supposed to do (that is, exits with code 42). The ELF header for this file is:

ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              EXEC (Executable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x4000f0
  Start of program headers:          64 (bytes into file)
  Start of section headers:          0 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         1
  Size of section headers:           64 (bytes)
  Number of section headers:         0
  Section header string table index: 0

Problem solved, eh?

Unfortunately, I don't think so. Delving a bit deeper into the binary reveals that it does not use the hand-written ELF header that we had written in the .s file:

gef➤  search-pattern 'ELF'
[+] Searching 'ELF' in memory
[+] In '/home/VM/Desktop/Experiments/ELF/a.out'(0x400000-0x401000), permission=r-x
  0x400001 - 0x400004  →   "ELF[...]"
  0x400079 - 0x40007c  →   "ELF[...]"

Oops, it looks like ld (or as, I am not sure!) created its own ELF header, as is evident from the fact that there are 2 locations containing the string ELF. From what I understand, the ELF string at 0x400001 is the one contributed by ld (or as), and the one at 0x400079 is the one I included in the .s file.

However, this defeats the very purpose of writing an ELF file from scratch, including the ELF header and program headers! I am also sure this problem is present in approaches 1 and 2 listed above, but since they exited with a segmentation fault anyway, I did not highlight the issue there.
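
The same check can be made on the file itself rather than in a debugger. The following is a minimal Python sketch; find_elf_magics is a hypothetical helper name, and the buffer it scans is synthetic, built only to mirror the two hits shown above (a linker-generated header followed 0x78 bytes later by the hand-written one).

```python
from typing import List

ELF_MAGIC = b"\x7fELF"


def find_elf_magics(data: bytes) -> List[int]:
    """Return every offset in data where the 4-byte ELF magic starts."""
    offsets = []
    pos = data.find(ELF_MAGIC)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(ELF_MAGIC, pos + 1)
    return offsets


# Synthetic stand-in for the linked a.out: one magic at offset 0 and a
# second one at offset 0x78, padded with zero bytes in between and after.
image = ELF_MAGIC.ljust(0x78, b"\x00") + ELF_MAGIC.ljust(0x40, b"\x00")

print([hex(off) for off in find_elf_magics(image)])
```

To scan a real binary, the bytes could be read with open(path, "rb").read() and passed to the same function; more than one hit suggests that a tool added its own header in front of the hand-written one.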

Now, I am out of ideas on what to do, so I would highly appreciate any help for getting the ELF file correctly built!

Thanks a lot!


Solution

  • The message about _start is a red herring. The linker normally uses that symbol to fill in the entry point, but since you fill it in by hand and are not relying on the linker to emit the ELF header, it is irrelevant. The real problem is that you got the phdr wrong: in the 64-bit program header, p_flags is 4 bytes and is the second field, and p_offset is 8 bytes. The fixed version is:

        .4byte 0x01 # PT_LOAD ; P_TYPE
        .4byte 5 # R-X ; P_FLAGS
        .8byte 0 # located at offset 0??? ; P_OFFSET
        .8byte ehdr # ; P_VADDR
        .8byte ehdr # ; P_PADDR
        .8byte filesize # ; P_FILESZ
        .8byte filesize # ; P_MEMSZ
        .8byte 0x1000 # P_ALIGN
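
To see why the field order matters, here is a minimal Python sketch that packs the program header values in the question's order and in the fixed order, then decodes both as a 64-bit program header. The load address 0x200000 (from ld -Ttext 200000, which ld reads as hex) and the 12 bytes of code are assumptions made for illustration; the names decode, as_written and fixed are hypothetical.

```python
import struct

# Elf64_Phdr on disk: two 4-byte fields followed by six 8-byte fields.
ELF64_PHDR = struct.Struct("<IIQQQQQQ")
FIELDS = ("p_type", "p_flags", "p_offset", "p_vaddr",
          "p_paddr", "p_filesz", "p_memsz", "p_align")

BASE = 0x200000          # assumed load address
FILESIZE = 64 + 56 + 12  # assumed: ELF header + program header + code bytes


def decode(raw):
    """Interpret 56 raw bytes the way a loader or readelf would."""
    return dict(zip(FIELDS, ELF64_PHDR.unpack(raw)))


# Question's order: type, offset (4 bytes), vaddr, paddr, filesz, memsz,
# flags (8 bytes), align. The byte widths happen to match Elf64_Phdr,
# so the result is still 56 bytes, just with the values in the wrong slots.
question_raw = struct.pack("<IIQQQQQQ",
                           1, 0, BASE, BASE, FILESIZE, FILESIZE, 5, 0x1000)

# Fixed order: type, flags, offset, vaddr, paddr, filesz, memsz, align.
fixed_raw = struct.pack("<IIQQQQQQ",
                        1, 5, 0, BASE, BASE, FILESIZE, FILESIZE, 0x1000)

as_written = decode(question_raw)
fixed = decode(fixed_raw)

for name in FIELDS:
    print(f"{name:9} as written: {as_written[name]:#10x}   fixed: {fixed[name]:#10x}")
```

With the question's ordering, p_flags decodes as 0, p_offset as the load address, and p_memsz as 5, so p_filesz ends up larger than p_memsz. That is consistent with the readelf error quoted in the first attempt. The question's field order matches the 32-bit program header, where p_flags comes after p_memsz, which would explain how the mistake crept in when porting the tutorial.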