Tags: assembly, virtualbox, nasm, x86-16, bootloader

Boot loader doesn't jump to kernel code


I'm writing a small operating system for practice, and I started with the bootloader.
I want to create a small command system that runs in 16-bit real mode (for now).
I've created a bootloader that resets the drive, then loads the sector after the bootloader.
The problem is that nothing actually happens after the jmp instruction.

I'm trying to load the next sector at 0x7E00, just after the bootloader. (I'm not totally sure how to specify the address using es:bx, so that may be the problem; I believe it is Address:offset.)

This is the code:

;
; SECTOR 0x0
;

;dl is number of harddrive where is bootloader
org 0x7C00
bits 16

;reset hard drive
xor ah,ah
int 0x13
;read sectors
clc
mov bx,0x7E00
mov es,bx
xor bx,bx
mov ah,0x02 ;function
mov al,0x1  ;sectors to read
mov ch,0x0  ;tracks
mov cl,0x1  ;sector
mov dh,0x0  ;head
int 0x13
;if not readed jmp to error
jc error
;jump to 0x7E00 - executed only if loaded
jmp 0x7E00
error:
    mov si,MSGError
    .loop:
        lodsb
        or al,al
        jz .end
        mov ah,0x0E
        int 0x10
        jmp .loop
    .end:
        hlt
MSGError db "Error while booting", 0x0
times 0x1FE - ($ - $$) db 0x0
db 0x55
db 0xAA

;
; SECTOR 0x1
;

jmp printtest
;definitions
MSGLoaded db "Execution successful", 0x0
;
; Print function
; si - message to pring (NEED TO BE FINISHED WITH 0x0)

printtest:
    mov si,MSGLoaded
    .loop:
        lodsb
        or al,al
        jz .end
        mov ah,0x0E
        int 0x10
        jmp .loop
    .end:
        hlt

times 0x400 - ($-$$) db 0x0

I've been testing this code using VirtualBox, but nothing actually happens: neither the read error nor the message that should be printed is shown.


Solution

  • The primary problems with this code were:

    1. ES:BX was pointing to the wrong segment:offset to load the kernel into
    2. Wrong sector was being loaded so kernel wasn't what was expected

    The first one was in this code:

    mov bx,0x7E00
    mov es,bx
    xor bx,bx
    

    The question wants to load the sector from disk to 0x0000:0x7E00 (ES:BX). This code instead sets ES:BX to 0x7E00:0x0000, which resolves to a physical address of 0x7E000 ((0x7E00<<4)+0x0000). I think the intention was to load 0x07E0 into ES, which would yield a physical address of 0x7E00 ((0x07E0<<4)+0x0000). A real-mode physical address is the segment multiplied by 16 plus the offset; multiplying the segment by 16 is the same as shifting it left 4 bits.
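
    As a hedged illustration of that arithmetic, either of the register pairs below addresses physical 0x7E00 as the INT 0x13 read buffer. This is a minimal sketch; the two options are alternatives (use one or the other), and which one to prefer is a judgment call rather than something the question dictates.

    ```nasm
    ; Option A: segment 0x0000, offset 0x7E00 -> (0x0000 << 4) + 0x7E00 = 0x7E00
    xor bx,bx
    mov es,bx        ; ES = 0x0000
    mov bx,0x7E00    ; BX = 0x7E00

    ; Option B: segment 0x07E0, offset 0x0000 -> (0x07E0 << 4) + 0x0000 = 0x7E00
    mov bx,0x07E0
    mov es,bx        ; ES = 0x07E0
    xor bx,bx        ; BX = 0x0000
    ```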

    The second problem in the code is here:

    mov ah,0x02 ;function
    mov al,0x1  ;sectors to read
    mov ch,0x0  ;tracks
    mov cl,0x1  ;sector
    mov dh,0x0  ;head
    int 0x13
    

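    CHS sector numbers start at 1, so the second 512-byte sector on the disk is sector 2, not 1. Sector 1 is the boot sector itself, so the original code read the bootloader again instead of the kernel. To fix the above code you need to set CL accordingly: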

    mov cl,0x2  ;sector number
    
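
    Combining both fixes, the read call might look like the sketch below. It assumes DL still holds the boot drive number handed over by the BIOS and reuses the `error` label from the question's code.

    ```nasm
    xor ax,ax
    mov es,ax        ; ES = 0x0000
    mov bx,0x7E00    ; ES:BX = 0x0000:0x7E00 (physical 0x7E00)
    mov ah,0x02      ; function: read sectors
    mov al,0x1       ; sectors to read
    mov ch,0x0       ; cylinder 0
    mov cl,0x2       ; sector 2 (sector 1 is the boot sector)
    mov dh,0x0       ; head 0
    int 0x13
    jc error         ; carry flag set means the read failed
    ```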

    General Tips for Bootloader Development

    Other issues that can trip up code running on various emulators, virtual machines, and real physical hardware, and that should be addressed, are:

    1. When the BIOS jumps to your code you can't rely on CS,DS,ES,SS,SP registers having valid or expected values. They should be set up appropriately when your bootloader starts. You can only be guaranteed that your bootloader will be loaded and run from physical address 0x00007c00 and that the boot drive number is loaded into the DL register.
    2. Set SS:SP to memory that you know won't conflict with the operation of your own code. The BIOS may have placed its default stack pointer anywhere in the first megabyte of usable and addressable RAM. There is no guarantee as to where that is and whether it will be suitable for the code you write.
    3. The direction flag used by lodsb, movsb, etc. could be either set or cleared. If the direction flag is set improperly, the SI/DI registers may be adjusted in the wrong direction. Use STD/CLD to set it to the direction you wish (CLD=forward/STD=backwards). In this case the code assumes forward movement, so one should use CLD. More on this can be found in an instruction set reference.
    4. When jumping to a kernel it is generally a good idea to FAR JMP to it so that it properly sets CS:IP to expected values. This can avoid problems with kernel code that may do absolute near JMPs and CALLs within the same segment.
    5. If you are writing 16-bit boot loader code that should work on 8086/8088 processors (and higher), avoid using 32-bit registers in assembly code. Use AX/BX/CX/DX/SI/DI/SP/BP instead of EAX/EBX/ECX/EDX/ESI/EDI/ESP/EBP. Although not an issue in this question, it has been an issue for others seeking help. A 32-bit processor can use 32-bit registers in 16-bit real mode, but an 8086/8088/80286 can't, since they are 16-bit processors without access to the extended 32-bit registers.
    6. FS and GS segment registers were added to 80386+ CPUs. Avoid them if you intend to target 8086/8088/80286.
    7. Note: this is a very common issue asked about on Stack Overflow. If you intend to boot from USB media using floppy disk emulation (FDD) on real hardware, you may need a BIOS Parameter Block (BPB) in your boot sector. You can find more detailed information in this related Stack Overflow answer, which provides an example BPB and a tool to see whether your BIOS overwrites data in the BPB after loading your boot sector into memory.
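
    For the last item, a boot sector that begins with a BPB could be laid out roughly as follows. This is only a sketch: it assumes 1.44 MB floppy geometry with FAT12-style values, and the `start` label, OEM name, volume ID, and volume label are illustrative placeholders rather than anything a BIOS mandates.

    ```nasm
    org 0x7C00
    bits 16

        jmp short start      ; 2-byte jump over the BPB
        nop                  ; padding so the OEM name begins at offset 3

        ; BPB for a 1.44 MB floppy (illustrative values)
        db "mkfs.fat"        ; OEM name, 8 bytes
        dw 512               ; bytes per sector
        db 1                 ; sectors per cluster
        dw 1                 ; reserved sectors
        db 2                 ; number of FATs
        dw 224               ; root directory entries
        dw 2880              ; total sectors
        db 0xF0              ; media descriptor
        dw 9                 ; sectors per FAT
        dw 18                ; sectors per track
        dw 2                 ; number of heads
        dd 0                 ; hidden sectors
        dd 0                 ; total sectors (32-bit), unused here
        db 0                 ; drive number
        db 0                 ; reserved
        db 0x29              ; extended boot signature
        dd 0x2D7E5A1A        ; volume ID (arbitrary)
        db "NO NAME    "     ; volume label, 11 bytes
        db "FAT12   "        ; file system type, 8 bytes

    start:
        ; bootloader code continues here
    ```

    The point of the layout is that anything the BIOS writes into the BPB area lands on data fields instead of on instructions.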

    To resolve the first and second items, this code can be used near the start of the boot loader:

    xor ax,ax      ; We want a segment of 0 for DS for this question
    mov ds,ax      ;     Set AX to appropriate segment value for your situation
    mov es,ax      ; In this case we'll default to ES=DS
    mov bx,0x8000  ; Stack segment can be any usable memory
    
    cli            ; Disable interrupts to circumvent bug on early 8088 CPUs
    mov ss,bx      ; This places it with the top of the stack @ 0x80000.
    mov sp,ax      ; Set SP=0 so the bottom of stack will be @ 0x8FFFF
    sti            ; Re-enable interrupts
    
    cld            ; Set the direction flag to be positive direction
    

    A couple of things to note. When you change the value of the SS register (in this case via a MOV), the processor is supposed to turn off interrupts for that instruction and keep them off until after the following instruction. Normally you don't need to worry about disabling interrupts if you update SS followed immediately by an update of SP. There is a bug in very early 8088 processors where this wasn't honored, so if you are targeting the widest possible range of environments it is a safe bet to explicitly disable and re-enable them. If you don't intend to ever run on a buggy 8088, then the CLI/STI instructions can be removed from the code above. I know about this bug first hand from work I did in the mid 80s on my home PC.

    The second thing to note is how I set up the stack. For people new to 8088/8086 16-bit assembly, the stack can be set up in a multitude of ways. In this case I set the top of the stack (the lowest part in memory) at segment 0x8000 (SS). I then set the stack pointer (SP) to 0. When you push something on the stack in 16-bit real mode, the processor first decrements the stack pointer by 2 and then places a 16-bit WORD at that location. Thus the first push to the stack would be at 0x0000-2 = 0xFFFE (-2). You'd then have an SS:SP of 0x8000:0xFFFE. In this case the stack runs from 0x8000:0x0000 to 0x8000:0xFFFF.
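
    The wraparound on the first push can be traced with a short sketch. It assumes SS=0x8000 and SP=0x0000 as set up earlier; the value 0x1234 is arbitrary.

    ```nasm
    mov ax,0x1234
    push ax          ; SP wraps: 0x0000 - 2 = 0xFFFE, word stored at 0x8000:0xFFFE
                     ; physical address = (0x8000 << 4) + 0xFFFE = 0x8FFFE
    pop ax           ; word is read back and SP returns to 0x0000
    ```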

    When dealing with the stack on an 8086 (this doesn't apply to 80286 and 80386+ processors), it is a good idea to set the stack pointer (SP) to an even number. On the original 8086, if you set SP to an odd number you would incur a 4 clock cycle penalty for every access to stack space. Since the 8088 had an 8-bit data bus this penalty didn't exist, but loading a 16-bit word took 4 clock cycles on the 8086 whereas it took 8 clock cycles on the 8088 (two 8-bit memory reads).

    Lastly, if you want to explicitly set CS:IP so that CS is properly set by the time the JMP (to your kernel) is complete, then it is recommended to do a FAR JMP (see Operations that affect segment registers/FAR Jump). In NASM syntax the JMP would look like this:

    jmp 0x07E0:0x0000
    

    Some assemblers (e.g. MASM/MASM32) don't have direct support for encoding a FAR JMP, so one way it can be done is manually like this:

    db 0x0ea     ; Far Jump instruction
    dw 0x0000    ; Offset
    dw 0x07E0    ; Segment
    

    If using the GNU assembler, it would look like:

    ljmpw $0x07E0,$0x0000
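
    Putting the pieces together, a revised version of the question's two-sector file might look like the sketch below. It rests on a few assumptions that are not in the original code: the `start` and `kernel` labels are illustrative; `cpu 8086` is used so that NASM rejects instructions newer than the 8086; BX is cleared before printing because INT 0x10/AH=0x0E takes the display page in BH; and the final `hlt` is wrapped in a loop so an interrupt cannot resume execution into the data that follows. Because both sectors are assembled in one file with `org 0x7C00`, every label is an offset from segment 0, so the far jump here uses 0x0000:0x7E00 (the same physical address as 0x07E0:0x0000) to keep CS consistent with DS.

    ```nasm
    cpu 8086
    org 0x7C00
    bits 16

    start:
        xor ax,ax
        mov ds,ax            ; DS = 0 to match org 0x7C00
        mov es,ax            ; ES = 0 so ES:BX = 0x0000:0x7E00
        mov bx,0x8000
        cli
        mov ss,bx            ; stack segment 0x8000
        mov sp,ax            ; SP = 0, first push lands at 0x8000:0xFFFE
        sti
        cld                  ; lodsb moves forward

        xor ah,ah            ; reset the drive in DL (set by the BIOS)
        int 0x13

        mov bx,0x7E00        ; load offset
        mov ah,0x02          ; function: read sectors
        mov al,0x1           ; sectors to read
        mov ch,0x0           ; cylinder 0
        mov cl,0x2           ; sector 2
        mov dh,0x0           ; head 0
        int 0x13
        jc error

        jmp 0x0000:0x7E00    ; far jump sets CS=0, IP=0x7E00

    error:
        xor bx,bx            ; BH = display page 0
        mov si,MSGError
    .loop:
        lodsb
        or al,al
        jz .end
        mov ah,0x0E
        int 0x10
        jmp .loop
    .end:
        hlt
        jmp .end             ; stay halted if an interrupt wakes the CPU

    MSGError db "Error while booting", 0x0
    times 0x1FE - ($ - $$) db 0x0
    db 0x55
    db 0xAA

    ; second sector, loaded at 0x0000:0x7E00
    kernel:
        xor bx,bx            ; BH = display page 0
        mov si,MSGLoaded
    .loop:
        lodsb
        or al,al
        jz .end
        mov ah,0x0E
        int 0x10
        jmp .loop
    .end:
        hlt
        jmp .end

    MSGLoaded db "Execution successful", 0x0
    times 0x400 - ($ - $$) db 0x0
    ```

    Assembling with something like `nasm -f bin boot.asm -o boot.img` (the file names are placeholders) should produce a 1024-byte flat binary containing both sectors.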