Tags: assembly, x86, bootloader, gnu-assembler, xv6

Passing Constant String Value to Register


I am rewriting the boot sector for the xv6 OS as an assignment and am trying to run a simple test program that does not produce the correct output in QEMU.

I am using the QEMU emulator (system i386) under the Windows Subsystem for Linux (Ubuntu 18.04.1 LTS). The code does display a character correctly when I pass a literal to %al, and it then enters the subsequent infinite loop.

.code16
.globl start

start:
cli
xorw %ax, %ax
movw %ax, %ds
movw %ax, %es
movw %ax, %ss

movb $0x0e, %ah
movb hello, %al
movb $0, %bh
movb $7, %bl
int $0x10

stop:
jmp stop

hello:
.string "Hello world."

.org 0x1fe
.word 0xAA55

I expect an output of H but all that is printed is S; it does not output anything at all in most other cases, unless a literal is used.


EDIT: Here is the disassembly of the binary using objdump after building:

bootsector.img:     file format elf64-x86-64


Disassembly of section .text:

0000000000000000 <start>:
   0:    fa                       cli
   1:    31 c0                    xor    %eax,%eax
   3:    8e d8                    mov    %eax,%ds
   5:    8e c0                    mov    %eax,%es
   7:    8e d0                    mov    %eax,%ss
   9:    b4 0e                    mov    $0xe,%ah
   b:    a0 00 00 b7 00 b3 03     movabs 0x10cd03b300b70000,%al
  12:    cd 10 

0000000000000014 <stop>:
  14:    eb fe                    jmp    14 <stop>

0000000000000016 <hello>:
  16:    48                       rex.W
  17:    65 6c                    gs insb (%dx),%es:(%rdi)
  19:    6c                       insb   (%dx),%es:(%rdi)
  1a:    6f                       outsl  %ds:(%rsi),(%dx)
  1b:    20 77 6f                 and    %dh,0x6f(%rdi)
  1e:    72 6c                    jb     8c <hello+0x76>
  20:    64 2e 00 00              fs add %al,%cs:(%rax)
    ...
 1fc:    00 00                    add    %al,(%rax)
 1fe:    55                       push   %rbp
 1ff:    aa                       stos   %al,%es:(%rdi)

Steps I used to build and execute the code:

$ as bootsector.S -o bootsector.img
$ objcopy -O binary bootsector.img
$ qemu-system-i386 bootsector.img -curses

Solution

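  • The problem is that you assemble your code to an ELF object file and then convert that object directly to a binary file, skipping the link step. Without linking, the reference to hello is never resolved to its run-time address: your disassembly shows the load encoded as a0 00 00, so it reads the byte at offset 0x0000 rather than the first byte of the string. You need to add a link step that produces an executable from the ELF object and then convert that to binary. The link step is also where you specify the origin point of 0x7c00, the address at which the BIOS loads the boot sector.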

    What you need to do is:

    • Assemble the boot sector with AS (GNU assembler) to an ELF object. I recommend outputting as ELF32 instead of ELF64.
    • Use LD (GNU Linker) to link the ELF object to an ELF executable or to a binary file directly. Use LD to specify an origin point of 0x7c00.
    • Place the boot sector in an image file.

    The commands you can use are:

    as boot.s -o boot.o
    ld -Ttext=0x7c00 --oformat=binary boot.o -o boot.bin
    

    boot.bin should be the final binary and should be exactly 512 bytes. You would normally place the boot sector in a disk image but for test purposes you should be able to run it directly with QEMU with:

    qemu-system-i386 -fda boot.bin
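Before booting the image, it can help to confirm the two properties mentioned above: the file is exactly 512 bytes and it ends with the bytes 55 aa (the .word 0xAA55 stored little-endian). The following is a minimal sketch using only standard POSIX tools; check_boot_sector is a hypothetical helper name, not part of any toolchain.

```shell
# check_boot_sector FILE: succeed only if FILE is 512 bytes and ends in 55 aa.
# (Hypothetical helper for illustration; the name is an assumption.)
check_boot_sector() {
    size=$(wc -c < "$1" | tr -d ' ')
    if [ "$size" -ne 512 ]; then
        echo "$1: size is $size bytes, expected 512"
        return 1
    fi
    # Read the two bytes at offset 510 as hex and strip the whitespace.
    sig=$(od -An -tx1 -j510 -N2 "$1" | tr -d ' \n')
    if [ "$sig" != "55aa" ]; then
        echo "$1: signature bytes are '$sig', expected '55aa'"
        return 1
    fi
    echo "$1: size and boot signature look correct"
}

# Usage, once boot.bin has been built:
# check_boot_sector boot.bin
```

A size far larger than 512 bytes usually means that something other than the code section ended up in the binary.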
    

    If you dump the binary file using 16-bit instruction decoding starting with an origin point of 0x7c00 with the command:

    ndisasm -b16 -o 0x7c00 boot.bin
    

    you should get output similar to:

    00007C00  FA                cli
    00007C01  31C0              xor ax,ax
    00007C03  8ED8              mov ds,ax
    00007C05  8EC0              mov es,ax
    00007C07  8ED0              mov ss,ax
    00007C09  B40E              mov ah,0xe
    00007C0B  A0167C            mov al,[0x7c16]
    00007C0E  B700              mov bh,0x0
    00007C10  B307              mov bl,0x7
    00007C12  CD10              int 0x10
    00007C14  EBFE              jmp short 0x7c14
    00007C16  48                dec ax
    00007C17  656C              gs insb
    00007C19  6C                insb
    00007C1A  6F                outsw
    00007C1B  20776F            and [bx+0x6f],dh
    00007C1E  726C              jc 0x7c8c
    00007C20  642E0000          add [cs:bx+si],al
    00007C24  0000              add [bx+si],al
    00007C26  0000              add [bx+si],al
    00007C28  0000              add [bx+si],al
    
    [snip for brevity]
    
    00007DFA  0000              add [bx+si],al
    00007DFC  0000              add [bx+si],al
    00007DFE  55                push bp
    00007DFF  AA                stosb
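The address 0x7c16 in the mov al,[0x7c16] line is simply the origin point plus the offset of hello within the section (0x16 in the question's disassembly). As a small worked check, using a hypothetical helper named label_address:

```shell
# label_address ORIGIN OFFSET: print the run-time address of a label.
# (Hypothetical helper for illustration; the name is an assumption.)
label_address() {
    printf '0x%04x\n' "$(( $1 + $2 ))"
}

label_address 0x7c00 0x16    # hello: prints 0x7c16
label_address 0x7c00 0x14    # stop:  prints 0x7c14
```

Both results match the addresses in the listing above, which is what the -Ttext=0x7c00 option asks the linker to compute.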
    

    Alternative way to Build

    You can also use AS to assemble to an object file, use LD to link it to an ELF executable, and then use OBJCOPY to convert the ELF executable to a binary file. This also lets you use the ELF executable for symbolic debugging with remote GDB and similar tools. You can also use OBJDUMP instead of NDISASM to see the generated code.

    The sequence of commands would be:

    as boot.s -o boot.o
    ld -Ttext=0x7c00 boot.o -o boot.elf
    objcopy -O binary boot.elf boot.bin
    

    Now you can use OBJDUMP to dump boot.elf. Be aware that you need to specify that you want to decode as 16-bit code. The OBJDUMP command would be:

    objdump -Dx -Mi8086 boot.elf
    

    The output would appear similar to this (if using a 64-bit toolchain as you appear to be using):

    boot.elf:     file format elf64-x86-64
    boot.elf
    architecture: i386:x86-64, flags 0x00000112:
    EXEC_P, HAS_SYMS, D_PAGED
    start address 0x0000000000007c00
    
    Program Header:
        LOAD off    0x0000000000000000 vaddr 0x0000000000007000 paddr 0x0000000000007000 align 2**12
             filesz 0x0000000000000e00 memsz 0x0000000000000e00 flags r-x
        LOAD off    0x00000000000010e8 vaddr 0x00000000004000e8 paddr 0x00000000004000e8 align 2**12
             filesz 0x0000000000000020 memsz 0x0000000000000020 flags r--
        NOTE off    0x00000000000010e8 vaddr 0x00000000004000e8 paddr 0x00000000004000e8 align 2**3
             filesz 0x0000000000000020 memsz 0x0000000000000020 flags r--
    
    Sections:
    Idx Name          Size      VMA               LMA               File off  Algn
      0 .note.gnu.property 00000020  00000000004000e8  00000000004000e8  000010e8  2**3
                      CONTENTS, ALLOC, LOAD, READONLY, DATA
      1 .text         00000200  0000000000007c00  0000000000007c00  00000c00  2**0
                      CONTENTS, ALLOC, LOAD, READONLY, CODE
    SYMBOL TABLE:
    00000000004000e8 l    d  .note.gnu.property     0000000000000000 .note.gnu.property
    0000000000007c00 l    d  .text  0000000000000000 .text
    0000000000000000 l    df *ABS*  0000000000000000 boot.o
    0000000000007c16 l       .text  0000000000000000 hello
    0000000000007c14 l       .text  0000000000000000 stop
    0000000000007c00 g       .text  0000000000000000 _start
    0000000000008000 g       .text  0000000000000000 __bss_start
    0000000000008000 g       .text  0000000000000000 _edata
    0000000000008000 g       .text  0000000000000000 _end
    
    
    
    Disassembly of section .note.gnu.property:
    
    00000000004000e8 <.note.gnu.property>:
      4000e8:       04 00                   add    $0x0,%al
      4000ea:       00 00                   add    %al,(%rax)
      4000ec:       10 00                   adc    %al,(%rax)
      4000ee:       00 00                   add    %al,(%rax)
      4000f0:       05 00 00 00 47          add    $0x47000000,%eax
      4000f5:       4e 55                   rex.WRX push %rbp
      4000f7:       00 01                   add    %al,(%rcx)
      4000f9:       00 00                   add    %al,(%rax)
      4000fb:       c0 04 00 00             rolb   $0x0,(%rax,%rax,1)
      4000ff:       00 01                   add    %al,(%rcx)
      400101:       00 00                   add    %al,(%rax)
      400103:       00 00                   add    %al,(%rax)
      400105:       00 00                   add    %al,(%rax)
            ...
    
    Disassembly of section .text:
    
    0000000000007c00 <_start>:
        7c00:       fa                      cli
        7c01:       31 c0                   xor    %eax,%eax
        7c03:       8e d8                   mov    %eax,%ds
        7c05:       8e c0                   mov    %eax,%es
        7c07:       8e d0                   mov    %eax,%ss
        7c09:       b4 0e                   mov    $0xe,%ah
        7c0b:       a0 16 7c b7 00 b3 07    movabs 0x10cd07b300b77c16,%al
        7c12:       cd 10
    
    0000000000007c14 <stop>:
        7c14:       eb fe                   jmp    7c14 <stop>
    
    0000000000007c16 <hello>:
        7c16:       48                      rex.W
        7c17:       65 6c                   gs insb (%dx),%es:(%rdi)
        7c19:       6c                      insb   (%dx),%es:(%rdi)
        7c1a:       6f                      outsl  %ds:(%rsi),(%dx)
        7c1b:       20 77 6f                and    %dh,0x6f(%rdi)
        7c1e:       72 6c                   jb     7c8c <hello+0x76>
        7c20:       64 2e 00 00             fs add %al,%cs:(%rax)
            ...
        7dfc:       00 00                   add    %al,(%rax)
        7dfe:       55                      push   %rbp
        7dff:       aa                      stos   %al,%es:(%rdi)
    

    Observations

    You may be curious why your output contains a strange movabs instruction:

    bootsector.img:     file format elf64-x86-64
    
    Disassembly of section .text:
    
    0000000000000000 <start>:
       0:    fa                       cli
       1:    31 c0                    xor    %eax,%eax
       3:    8e d8                    mov    %eax,%ds
       5:    8e c0                    mov    %eax,%es
       7:    8e d0                    mov    %eax,%ss
       9:    b4 0e                    mov    $0xe,%ah
       b:    a0 00 00 b7 00 b3 03     movabs 0x10cd03b300b70000,%al
      12:    cd 10
    

    OBJDUMP doesn't know that the code is 16-bit (that information isn't retained in an ELF object file). AS on a 64-bit toolchain produces ELF64 objects (elf64-x86-64) by default, and OBJDUMP defaults to 64-bit decoding for ELF64 files, which is what produced the incorrect decoding. In 64-bit mode the opcode a0 takes an 8-byte address, so the decoder swallowed the bytes of the two following movb instructions and of int $0x10 into a single movabs. You can use -Mi8086 to request 16-bit decoding with OBJDUMP.
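OBJDUMP can also decode a raw binary, which has no headers at all, if you tell it the format, the architecture and the load address. The sketch below assumes GNU objdump with x86 support. To stay self-contained it writes the first 22 bytes of the linked boot sector (taken from the listing earlier in this answer) to a placeholder file named code.bin; in practice you would pass boot.bin instead.

```shell
# Bytes fa 31 c0 8e d8 8e c0 8e d0 b4 0e a0 16 7c b7 00 b3 07 cd 10 eb fe,
# written as octal escapes because POSIX printf has no portable \x escape.
printf '\372\061\300\216\330\216\300\216\320\264\016\240\026\174\267\000\263\007\315\020\353\376' > code.bin

# -b binary          : the input is a headerless raw image
# -m i386 -M i8086   : x86 architecture, decoded as 16-bit code
# --adjust-vma=0x7c00: display addresses as if loaded at 0x7c00
objdump -D -b binary -m i386 -M i8086 --adjust-vma=0x7c00 code.bin
```

With 16-bit decoding, the a0 16 7c bytes should appear as a one-byte load from 0x7c16 rather than as a movabs.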


    Other Recommendations

    • Consider renaming start to _start to keep the linker from giving a warning about not being able to find the entry point. The warning isn't fatal and can be ignored. The alternative is to tell LD that the entry point is start by adding the extra option --entry=start. The command could look like:

      ld -Ttext=0x7c00 --entry=start boot.o -o boot.elf