assembly x86-16 machine-code dosbox

Why does my machine code not behave as expected?


I am using the DOSBox debugger as an environment to explore how an x86/64-based processor steps through machine code.

As a reference I am using the "DOS2 length-delimited output" example I found at: https://montcs.bloomu.edu/~bobmon/Information/LowLevel/Assembly/hello-asm.html

I have tried several different approaches, but this is the one that has produced the results closest to what I am looking for.

I am using a hex editor to enter the bytes manually, and here is the hex code I currently have saved in a file called "executable.com":

68 DD 01 1F B2 00 B6 00 B1 06 B3 01 B4 40 B0 00
CD 21 B4 4C B0 00 CD 21 48 65 6C 6C 6F 21 0A D0
0A 24 20

Executing this file through the debugger gives the following code overview:

01DD:0100  68DD01              push 01DD
01DD:0103  1F                  pop  ds
01DD:0104  B200                mov  dl,00
01DD:0106  B600                mov  dh,00
01DD:0108  B106                mov  cl,06
01DD:010A  B301                mov  bl,01
01DD:010C  B440                mov  ah,40
01DD:010E  B000                mov  al,00
01DD:0110  CD21                int  21
01DD:0112  B44C                mov  ah,4C
01DD:0114  B000                mov  al,00
01DD:0116  CD21                int  21

This is somewhat similar to the code in the link (which I have also tried, of course), and it does print a string of length 6 as expected.
However, the string is not fetched from where I want, so the output is just a mess of characters instead of the "Hello!" that is present in the hex code.

Any thoughts on what is going on?


Solution

  • I recreated the example using NASM, as suggested by Peter Cordes. At first this produced exactly the same results as one of my previous attempts, but when I added "org 0x100" to the beginning of my assembly source, I got the result I was looking for.

    This tells the assembler that the code will be loaded into memory at offset 0x100 rather than 0x00, so it adds 0x100 to every address it computes from a label. DOS loads a .COM file at offset 0x100 of its segment, which is why the offset is needed. In this example "org 0x100" changed only one bit in the output (the operand of "mov dx, msg" went from 13 00 to 13 01), but that one bit was the difference between reading from memory at the correct location and reading 256 bytes too early.

    This is how the machine code eventually turned out:

    BA 13 01 B9 06 00 BB 01 00 B8 00 40 CD 21 B8 00
    4C CD 21 48 65 6C 6C 6F 21
    

    And the assembly code used to produce it:

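    org 0x100           ; a .COM file is loaded at offset 0x100 of its segment
    
    mov dx, msg         ; DS:DX = address of the bytes to write (BA 13 01)
    mov cx, 0x06        ; CX = number of bytes to write
    mov bx, 1           ; BX = file handle 1 (standard output)
    mov ax, 0x4000      ; AH = 0x40: write to file or device
    int 0x21
    mov ax, 0x4C00      ; AH = 0x4C: terminate program, AL = exit code 0
    int 0x21
    
    msg db "Hello!"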
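
The effect of "org 0x100" can also be reproduced by hand, which may make it clearer what the directive does. The following is a minimal sketch under one assumption: the source is assembled with NASM's flat binary output (-f bin), where the origin defaults to 0 when no "org" is given. The load offset is then added to the label explicitly.

```nasm
; No org directive here: with -f bin, NASM then assumes the code starts at
; offset 0. DOS still loads the .COM file at offset 0x100 of its segment,
; so the 0x100 is added to the label by hand.

    mov dx, msg + 0x100 ; msg is 0x13 bytes into the file, so DX = 0x0113
    mov cx, 6           ; CX = number of bytes to write
    mov bx, 1           ; BX = file handle 1 (standard output)
    mov ax, 0x4000      ; AH = 0x40: write to file or device
    int 0x21
    mov ax, 0x4C00      ; AH = 0x4C: terminate program, AL = exit code 0
    int 0x21

msg db "Hello!"
```

This should assemble to the same bytes as the version with "org 0x100", because in both cases the operand of the first instruction is the file offset of msg (0x13) plus the load offset (0x100). The same arithmetic applies to the hex dump in the question: its code occupies 0x18 bytes, so "Hello!" ends up at offset 0x0118 in memory, while DL and DH are both loaded with 00, so DS:DX points at offset 0x0000 instead.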