Tags: assembly, x86, x86-16, osdev, watcom

Qemu and Raw Binary File


I am compiling and running binaries (boot sector, stage 1, stage 2) for practice. The boot sector and the first stage are written in asm, and both run fine. The second stage loads at 0x1000, and I have some asm which jumps to the start of my C code. My jumps and calls all seem to land two bytes short.

I have tried the code in Bochs and Qemu (stepping through it). All the code looks good. I have even disassembled it in IDA and everything looks good. I assume the cause might be my lack of knowledge about code alignment.

The 2nd stage starts at 0x1000:

0x1000: cli    
0x1001: xor    eax,eax
0x1003: mov    eax,0x1f1a
0x1008: mov    esp,eax
0x100a: sti    
0x100b: jmp    0x1010

The first jump lands at 0x1010 (this is disassembled C code):

0x1010: push   0x16b4
0x1015: call   0x14ca   <---
0x101a: add    esp,0x4
0x101d: jmp    0x101d

The call above to 0x14CA actually lands at 0x000014c8, two bytes short.

As in the above code, I expect the jump or call to land at the operand address, but it always misses short by two bytes.


Solution

  • This is a wild guess that may actually be wrong. It is based on the fact that the relative JMP and CALL instructions you encoded are 5 bytes long in 32-bit code but only 3 bytes long in 16-bit code, and 5 bytes - 3 bytes = 2 bytes. The target of a relative JMP or CALL is computed as the address of the next instruction plus the encoded displacement, so if the processor treats a 5-byte instruction as a 3-byte one, the target ends up 2 bytes lower than intended. That may be a hint as to what has gone wrong.
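To make that arithmetic concrete, here is a minimal Python sketch. The helper name `branch_target` is hypothetical and exists only for this illustration. It takes the CALL at 0x1015 from the question, assumes the assembler stored the displacement for a 5-byte encoding, and then computes where the branch lands if the instruction is instead treated as 3 bytes long.

```python
def branch_target(insn_addr: int, insn_len: int, disp: int, width_bits: int) -> int:
    """Near relative JMP/CALL target: address of the next instruction plus displacement."""
    mask = (1 << width_bits) - 1          # wrap to the 16- or 32-bit instruction pointer
    return (insn_addr + insn_len + disp) & mask


CALL_ADDR = 0x1015                        # address of the CALL in the question
INTENDED = 0x14CA                         # target the CALL was meant to reach

# Displacement as the assembler would store it for a 5-byte (32-bit) CALL.
DISP = INTENDED - (CALL_ADDR + 5)

target_32 = branch_target(CALL_ADDR, 5, DISP, 32)   # decoded as a 5-byte instruction
target_16 = branch_target(CALL_ADDR, 3, DISP, 16)   # decoded as a 3-byte instruction

print(hex(DISP), hex(target_32), hex(target_16))
```

The displacement is the same in both cases. Only the assumed instruction length changes, and that accounts for the whole 2-byte difference.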

    If I take this code:

    bits 32
    org 0x1000
    
        cli
        xor    eax,eax
        mov    eax,0x1f1a
        mov    esp,eax
        sti
        jmp    0x1010
        push   0x16b4
        call   0x14ca
        add    esp,0x4
        jmp    0x101d
    

    And assemble it with:

    nasm -f bin stage2.asm -o stage2.bin
    

    And review the 32-bit decoding with:

    ndisasm -b32 -o 0x1000 stage2.bin
    

    I get:

    00001000  FA                cli
    00001001  31C0              xor eax,eax
    00001003  B81A1F0000        mov eax,0x1f1a
    00001008  89C4              mov esp,eax
    0000100A  FB                sti
    0000100B  E900000000        jmp dword 0x1010
    00001010  68B4160000        push dword 0x16b4
    00001015  E8B0040000        call dword 0x14ca
    0000101A  83C404            add esp,byte +0x4
    0000101D  E9FBFFFFFF        jmp dword 0x101d
    

    This looks correct. If however I decode the same code as 16-bit with:

    ndisasm -b16 -o 0x1000 stage2.bin
    

    I get:

    00001000  FA                cli
    00001001  31C0              xor ax,ax
    00001003  B81A1F            mov ax,0x1f1a
    00001006  0000              add [bx+si],al
    00001008  89C4              mov sp,ax
    0000100A  FB                sti
    0000100B  E90000            jmp word 0x100e
    0000100E  0000              add [bx+si],al
    00001010  68B416            push word 0x16b4
    00001013  0000              add [bx+si],al
    00001015  E8B004            call word 0x14c8
    00001018  0000              add [bx+si],al
    0000101A  83C404            add sp,byte +0x4
    0000101D  E9FBFF            jmp word 0x101b
    00001020  FF                db 0xff
    00001021  FF                db 0xff
    

    The decoding as a whole is wrong, but the JMPs and CALLs are still recognized, and each one now goes to a memory location two bytes short of the intended target. This looks awfully like what you are observing.
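The same comparison can be reproduced without a disassembler. The Python sketch below is only an illustration: the `IMAGE` bytes are copied from the 32-bit listing, and `rel_target` is a hypothetical helper, not part of any tool. It assumes that 0x100B, 0x1015 and 0x101D are instruction boundaries under both decodings, which the two listings confirm for this particular binary.

```python
import struct

BASE = 0x1000  # load address, matching "org 0x1000"

# Bytes of stage2.bin, taken from the 32-bit ndisasm listing.
IMAGE = bytes.fromhex(
    "FA" "31C0" "B81A1F0000" "89C4" "FB"
    "E900000000" "68B4160000" "E8B0040000" "83C404" "E9FBFFFFFF"
)


def rel_target(addr: int, operand_bits: int) -> int:
    """Decode the near relative JMP (E9) or CALL (E8) at addr for a given operand size."""
    off = addr - BASE
    assert IMAGE[off] in (0xE8, 0xE9), "not a near relative JMP/CALL opcode"
    size = operand_bits // 8                      # displacement is 2 or 4 bytes
    fmt = "<i" if size == 4 else "<h"             # signed little-endian displacement
    (disp,) = struct.unpack_from(fmt, IMAGE, off + 1)
    next_insn = addr + 1 + size                   # opcode byte plus displacement
    return (next_insn + disp) & ((1 << operand_bits) - 1)


results = {
    addr: (rel_target(addr, 32), rel_target(addr, 16))
    for addr in (0x100B, 0x1015, 0x101D)
}

for addr, (t32, t16) in results.items():
    print(f"{addr:#06x}: 32-bit -> {t32:#06x}, 16-bit -> {t16:#06x}, short by {t32 - t16}")
```

For every branch, the 16-bit interpretation lands exactly two bytes below the 32-bit one, at the same 0x100e, 0x14c8 and 0x101b targets that `ndisasm -b16` reports.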

    Without seeing your code, I can only hope that by the time you start executing stage 2 at 0x1000 you have entered 32-bit protected mode. If you haven't, I suspect that is the root of your problem: I believe 32-bit encoded instructions are being executed in 16-bit real mode.


    Update

    From the comments, the OP says they entered 32-bit protected mode only as part of the process of entering unreal mode. They believed that unreal mode would still decode instructions as 32-bit code, and that belief is the cause of the problem.

    You get into unreal mode by entering 32-bit protected mode and then returning to 16-bit real mode. Unreal mode is still 16-bit real mode, with the exception that the limits in the hidden descriptor cache are set to 0xffffffff (4 GiB limit). After returning to 16-bit real mode you'll be able to directly address memory beyond 64 KiB within a segment using 32-bit addressing, but the code is still running in 16-bit real mode.
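As a rough mental model of the limit change, the Python sketch below treats a real-mode data access as a base of `segment << 4` plus an offset that is checked against the cached limit. It is a toy model under stated assumptions, not an emulator: the name `linear_address` is hypothetical, the `ValueError` stands in for the fault a real CPU would raise, and A20 gating is ignored.

```python
REAL_MODE_LIMIT = 0xFFFF          # cached limit in plain 16-bit real mode
UNREAL_MODE_LIMIT = 0xFFFFFFFF    # cached limit after the trip through protected mode


def linear_address(segment: int, offset: int, limit: int) -> int:
    """Model a real-mode data access: base = segment * 16, offset checked against limit."""
    if offset > limit:
        # Stand-in for the CPU fault raised on a segment limit violation.
        raise ValueError(f"offset {offset:#x} exceeds segment limit {limit:#x}")
    return ((segment << 4) + offset) & 0xFFFFFFFF


# An ordinary access within 64 KiB works in both modes.
within_64k = linear_address(0x0000, 0x1000, REAL_MODE_LIMIT)

# A 32-bit offset above 1 MiB is only usable once the cached limit has been raised.
above_1mib = linear_address(0x0000, 0x00100000, UNREAL_MODE_LIMIT)

try:
    linear_address(0x0000, 0x00100000, REAL_MODE_LIMIT)
    faulted = False
except ValueError:
    faulted = True

print(hex(within_64k), hex(above_1mib), faulted)
```

Nothing in this model changes how instructions are decoded. Only the range of usable offsets grows, which is why the code itself must still be 16-bit.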

    If you are writing code for 16-bit unreal mode, your compiler and assembler still need to generate 16-bit code. If you intend to write or generate 32-bit code, then unreal mode isn't an option, and you will need to enter 32-bit protected mode to execute it.