Tags: c, assembly, x86, osdev

How to print a string in x86 real-mode non-OS assembly


I am trying to implement a function that prints a string in 16-bit real mode; I run the result in QEMU.
kernel.c file:

void main()
{
 char* str = "Hello World!";
 printString(str); 
}

The printString function is defined in another file printString.c:

int printString(char* string)
{ 
 int i = 0;
 while (*(string + i) != '\0')
   {
    char al = *(string + i);
    char ah = 0xe;
    int ax = ah * 256 + al;
    interrupt(0x10,ax,0,0,0);
    i++;
   }
 return i;
}

The interrupt function invokes the BIOS interrupt whose number is given by its first argument; the remaining arguments supply the contents of the ax, bx, cx and dx registers, respectively. Here is the code:

.global _interrupt
_interrupt:
push bp
mov bp, sp
push si
push ds
mov ax, #0x100
mov ds, ax
mov ax, [bp + 0x4]
mov si, #intr
mov [si + 1], al
pop ds
mov ax, [bp + 0x6]
mov bx, [bp + 0x8]
mov cx, [bp + 0xa]
mov dx, [bp + 0xc]
intr: int #0x0
pop si
pop bp
ret

I compile the .c files using the command:

bcc -ansi -c -o <name.o> <name.c>

And link them using:

ld86 -o kernel -d kernel.o interrupt.o printString.o

Instead of printing "Hello World!", the program prints "S" on the screen. I loaded the linked kernel binary at address 0x1000. Here is the disassembly of the code:

0x1000: push   %bp
0x1001: mov    %sp,%bp
0x1003: push   %di
0x1004: push   %si
0x1005: dec    %sp
0x1006: dec    %sp
0x1007: mov    $0xc8,%bx
0x100a: mov    %bx,-0x6(%bp)
0x100d: pushw  -0x6(%bp)
0x1010: call   0x1058 

The pointer passed to the printString function (from kernel.c) is 0xc8, and the memory at address 0xc8 contains 0xf000ff53. Thus 0x53, which is the ASCII code for S, gets printed on the screen.
How should I pass the string to the printString function, and why doesn't the above code work?
Please tell me if I need to give more explanation.


Solution

  • Explanation

    • I got your code to work after setting up the DS data segment register early in kernel.c with inline assembly.
    • You're loading your kernel image at 0x1000 but building your code so that it thinks it starts at address 0x0. Thus, for data access to work, you need to set up the DS data segment register so that 0x1000 is added to the memory addresses used by data access instructions.
      • For example, to access the first byte of your "Hello World!" string located at offset 0xc8 of the kernel image, you need to access physical memory address 0x10c8. By setting DS to 0x100, an access to data memory at address 0xc8 is translated into an access to physical address $ds*0x10 + 0xc8 == 0x100*0x10 + 0xc8 == 0x10c8, i.e. the address we want.
      • Read the first half of x86 segmentation for the details.
      • The memory access happens in the *(string + i) expression in printString(), so stepping through the assembly-level instructions of the printString() loop in GDB should also help. Disassembling main() looks at the wrong code: the value 0xc8 passed there is a correct offset, and the problem only shows up when it is combined with the wrong segment.
    • I'm not sure how you're loading and running the code at address 0x1000 with QEMU. In my setup below, I'm loading the code via GDB and have a small boot sector that just jumps to that address.
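
The "S" from the question can be traced by hand. With DS left at 0, offset 0xc8 refers to physical address 0xc8, which lies inside the real-mode interrupt vector table (4 bytes per vector: offset word first, then segment word). The following is a minimal sketch that decodes the value 0xf000ff53 reported in the question; the function name is made up for illustration.

```python
import struct

def decode_ivt_entry(address, dword):
    """Interpret a 32-bit value read at a physical address inside the IVT."""
    raw = struct.pack('<I', dword)               # bytes as they sit in memory (little-endian)
    offset, segment = struct.unpack('<HH', raw)  # offset word first, then segment word
    vector = address // 4                        # each IVT entry occupies 4 bytes
    return vector, segment, offset, raw

vector, segment, offset, raw = decode_ivt_entry(0xC8, 0xF000FF53)
print('vector 0x%02x -> %04x:%04x' % (vector, segment, offset))

# printString() walks the same bytes as if they were a C string,
# stopping at the first zero byte.
text = raw.split(b'\x00')[0]
print(text)
```

The bytes in memory are 53 ff 00 f0, so the loop sees 0x53 ("S"), then 0xff, then the terminating zero. That matches a single visible "S", assuming the 0xff byte produces nothing visible on screen.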

    kernel.c

    void main()
    {
    #asm
        mov ax, #0x100
        mov ds, ax
    #endasm
        char* str = "Hello World!";
        printString(str); 
    }
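
The effect of the two instructions in the #asm block can be checked with the real-mode address formula, physical = segment * 16 + offset. This is only a sketch: the helper and constant names are illustrative, and address wraparound at 1 MiB is not modelled.

```python
def physical_address(segment, offset):
    """Real-mode translation: the segment is shifted left by 4 bits and added to the offset."""
    return (segment << 4) + offset

KERNEL_LOAD_ADDRESS = 0x1000   # where the kernel image is loaded
STRING_OFFSET = 0xC8           # offset of "Hello World!" inside the image

# Segment value that makes offset 0 map onto the load address.
ds = KERNEL_LOAD_ADDRESS >> 4

print(hex(ds))                                      # segment to load into DS
print(hex(physical_address(0x0000, STRING_OFFSET))) # DS = 0: wrong location
print(hex(physical_address(ds, STRING_OFFSET)))     # DS = 0x100: inside the kernel image
```

The same calculation gives the DS value for any other load address that is a multiple of 16.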
    

    printString.c

    int printString(char* string)
    { 
     int i = 0;
     while (*(string + i) != '\0')
       {
        char al = *(string + i);
        char ah = 0xe;
        int ax = ah * 256 + al;
        interrupt(0x10,ax,0,0,0);
        i++;
       }
     return i;
    }
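
printString() packs AH and AL into a single 16-bit value before calling interrupt(). Here is a small sketch of that packing, assuming ah = 0xe selects INT 10h teletype output as in the code above. The masking with 0xFF is an addition that is not in the original; it guards against a sign-extended char for bytes of 0x80 and above. The helper names are hypothetical.

```python
TELETYPE_OUTPUT = 0x0E  # INT 10h function number placed in AH

def pack_ax(ah, al):
    """Combine the high byte (AH) and the low byte (AL) into a 16-bit AX value."""
    return ((ah & 0xFF) << 8) | (al & 0xFF)

def teletype_ax_values(text):
    """AX value passed to interrupt(0x10, ax, 0, 0, 0) for each character."""
    return [pack_ax(TELETYPE_OUTPUT, byte) for byte in text.encode('ascii')]

values = teletype_ax_values("Hi")
print([hex(v) for v in values])
```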
    

    interrupt.asm

    .global _interrupt
    _interrupt:
    push bp
    mov bp, sp
    push si
    mov ax, [bp + 0x4]
    mov si, #intr
    mov [si + 1], al
    mov ax, [bp + 0x6]
    mov bx, [bp + 0x8]
    mov cx, [bp + 0xa]
    mov dx, [bp + 0xc]
    intr: int #0x0
    pop si
    pop bp
    ret
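
interrupt.asm works by patching its own code: an int instruction with an immediate operand is encoded as opcode 0xCD followed by the interrupt number, so storing AL at intr + 1 overwrites that number just before the instruction runs. The sketch below models this on a bytearray standing in for the kernel image; the image contents and the offset are invented for illustration.

```python
INT_OPCODE = 0xCD  # x86 encoding of "int imm8" is CD followed by the number

def patch_interrupt_number(image, intr_offset, number):
    """Mimic `mov [si + 1], al`: overwrite the operand byte of the int instruction."""
    if image[intr_offset] != INT_OPCODE:
        raise ValueError('no int instruction at offset 0x%x' % intr_offset)
    image[intr_offset + 1] = number & 0xFF
    return image

# Hypothetical image fragment: nop, "int 0x00", near ret
image = bytearray([0x90, 0xCD, 0x00, 0xC3])
patch_interrupt_number(image, 1, 0x10)
print(image.hex())
```

The store goes through DS, so it only hits the right byte when DS points at the kernel image. That is why the version in the question loaded DS with 0x100 around the store, and why this version can drop those instructions once main() has set DS.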
    

    boot.asm

    .global _main
    _main:
    mov ax, #0x1000
    jmp ax
    

    bootsector-create

    #!/usr/bin/env python3
    # Create an x86 boot sector
    # Pad file to 512 bytes, insert 0x55, 0xaa at end of file
    
    import sys
    import os
    
    def program_name():
        return os.path.basename(sys.argv[0])
    
    def print_usage_exit():
        sys.stderr.write('usage: %s IN_FILENAME OUT_FILENAME\n' % (program_name(),))
        sys.exit(2)
    
    def main(args):
        try:
            (in_filename, out_filename) = args
        except ValueError:
            print_usage_exit()
    
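        # Read at most 512 bytes; a shorter input is zero-padded.
        buf = bytearray(512)
        with open(in_filename, 'rb') as f:
            f.readinto(buf)
        # Boot signature in the last two bytes of the sector.
        buf[510] = 0x55
        buf[511] = 0xaa
        with open(out_filename, 'wb') as fout:
            fout.write(buf)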
    
    if __name__ == '__main__':
        main(sys.argv[1:])
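
A quick way to sanity-check the result is to confirm that the sector is exactly 512 bytes long and ends in 0x55 0xaa. The sketch below applies the same padding scheme in memory and then checks it; the function names are illustrative, and the size check on the input is an addition that the script above does not perform.

```python
def is_valid_boot_sector(data):
    """Return True if data is exactly one 512-byte sector ending in 0x55 0xAA."""
    return len(data) == 512 and data[510] == 0x55 and data[511] == 0xAA

def make_boot_sector(code):
    """Zero-pad the code to 512 bytes and append the boot signature."""
    if len(code) > 510:
        raise ValueError('boot code must fit in 510 bytes')
    sector = bytearray(512)
    sector[:len(code)] = code
    sector[510] = 0x55
    sector[511] = 0xAA
    return bytes(sector)

# 16-bit encoding of: mov ax, 0x1000 ; jmp ax
sector = make_boot_sector(b'\xb8\x00\x10\xff\xe0')
print(is_valid_boot_sector(sector))
```

To check a file on disk, the same predicate can be applied to its contents, for example is_valid_boot_sector(open('bootsect', 'rb').read()).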
    

    kernel.gdb

    set confirm 0
    set pagination 0
    set architecture i8086
    
    target remote localhost:1234
    set disassemble-next-line 1
    
    monitor system_reset
    delete
    
    restore kernel binary 0x1000
    continue
    

    GNUmakefile

    DERIVED_FILES := kernel kernel.o interrupt.o printString.o boot boot.o bootsect
    
    .PHONY: all
    all: boot kernel
    
    bootsect: boot
        ./bootsector-create $< $@
    
    boot: boot.o
        ld86 -o $@ -s -d $+
    
    kernel: kernel.o interrupt.o printString.o
        ld86 -o $@ -s -d $+
    
    %.o: %.c
        bcc -ansi -c -o $@ $<
    
    %.o: %.asm
        as86 -o $@ $<
    
    .PHONY: clean
    clean:
        rm -f $(DERIVED_FILES)
    
    .PHONY: emulator
    emulator: bootsect
        qemu-system-x86_64 -s -S bootsect
    
    .PHONY: gdb
    gdb:
        gdb -q -x kernel.gdb
    

    Sample Session

    $ make emulator
    (In a separate terminal)
    $ make gdb