I have assembly code that loads a COM file into memory and runs it. The COM file is loaded into a separate data segment; the loader then changes DS and SS to that segment and calls into it to run the COM file. My data and stack segments are:
STACKSEG SEGMENT STACK 'stack'
DW 512 DUP(?)
STACKSEG ENDS
DATASEG1 SEGMENT PARA 'data'
stacksp dw 0
stackbp dw 0
DATASEG1 ENDS
DATASEG2 SEGMENT PARA 'data'
WHERETOJ:
;stck db 32 dup (?)
txt db 4096 dup (?) , "$"
;string B
DATASEG2 ENDS
and my caller is:
mov stacksp, sp
mov stackbp, bp
ASSUME DS: DATASEG2, SS:DATASEG2
mov ax, DATASEG2
mov DS, AX
mov ES, AX
MOV SS, AX
mov SP, 0FFFFh
call far ptr WHERETOJ ; (Now registers are SP:FFFF, BP:0, SI:0C40, DI:0C16, DS:1820,ES:1820, SS:1820, CS:192D, IP:00BD)
ASSUME DS: DATASEG1, SS:STACKSEG
mov ax, STACKSEG
MOV SS, AX
MOV AX, DATASEG1
MOV DS, AX
MOV ES, AX
mov bx, offset stackbp
mov bp, [bx]
mov bx, offset stacksp
mov sp, [bx]
ret
The COM file just runs int 10h to print a character and should return to the caller:
mov ax, 0945h ; (Now registers are changed to SP:FFFB, BP:0, SI:0C40, DI:0C16, DS:1820,ES:1820, SS:1820, CS:1820, IP:0000)
mov bx, 0006
mov cx, 40
int 10h
ret ;; ( registers are SP:FFFB, BP:0, SI:0C40, DI:0C16, DS:1820,ES:1820, SS:1820, CS:1820, IP:000B)
;; after ret has run, the registers are: SP:FFFD, BP:0, SI:0C40, DI:0C16, DS:1820,ES:1820, SS:1820, CS:1820, IP:00C2
The problem is that when the COM file is run, the ret does not return to the caller but to a wrong, unknown IP.
Your primary issue is that the RET you are performing is a NEAR return. A NEAR return in real mode will result in a 16-bit offset being popped off the stack and the IP (Instruction Pointer) being set to that value. The segment will not change.
Your code:
call far ptr WHERETOJ
is a FAR CALL, which pushes the 16-bit Code Segment (CS) followed by the 16-bit IP. The NEAR RETURN pops only the 16-bit IP and leaves the segment on the stack.
At the point of the FAR CALL you said the registers had:
SP:FFFF, SS:1820, CS:192D, IP:00BD
A FAR CALL pushes CS followed by the IP of the instruction after the CALL. The FAR CALL is encoded as a 5-byte instruction, so the address pushed on the stack is 192Dh:00C2h (00BDh+5=00C2h). When you did the NEAR return, it didn't change CS but it set IP to 00C2h. It also popped only 2 bytes off the stack. This is why you saw this in the debugger after the RET instruction was executed:
SP:FFFD, SS:1820, CS:1820, IP:00C2
SP was incremented by 2 from 0FFFBh to 0FFFDh. CS remained the same and IP was set to 00C2h. The CS:IP pair is incorrect, so you ended up executing memory you didn't intend to. If you replace RET with RETF (FAR RETURN), your code will work as expected and the registers will have these values after RETF is executed:
SP:FFFF, SS:1820, CS:192D, IP:00C2
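The arithmetic above can be checked with a small model. This is only a sketch in Python, not 8086 code: the `Cpu` class and `run` function are made-up names for illustration, and the model tracks nothing except SP, CS, IP and the words pushed on the stack, using the register values quoted in the question.

```python
# Minimal model of the 8086 stack for the register values quoted in the question.
# Only SP, CS, IP and the pushed words are modelled; no instructions are executed.

MASK = 0xFFFF


class Cpu:
    def __init__(self, cs, ip, sp):
        self.cs, self.ip, self.sp = cs, ip, sp
        self.stack = {}  # maps an SP value to the 16-bit word stored there

    def push(self, word):
        self.sp = (self.sp - 2) & MASK  # SP is decremented first, then the word is written
        self.stack[self.sp] = word & MASK

    def pop(self):
        word = self.stack[self.sp]
        self.sp = (self.sp + 2) & MASK
        return word

    def call_far(self, seg, off, length=5):
        self.push(self.cs)                    # CS goes on the stack first
        self.push((self.ip + length) & MASK)  # then the IP of the next instruction
        self.cs, self.ip = seg, off

    def ret_near(self):
        self.ip = self.pop()                  # only IP is popped; CS is untouched

    def ret_far(self):
        self.ip = self.pop()                  # IP first
        self.cs = self.pop()                  # then CS


def run(use_retf):
    """Return (SP, CS, IP) right after the FAR CALL and right after the return."""
    cpu = Cpu(cs=0x192D, ip=0x00BD, sp=0xFFFF)
    cpu.call_far(0x1820, 0x0000)
    after_call = (cpu.sp, cpu.cs, cpu.ip)
    if use_retf:
        cpu.ret_far()
    else:
        cpu.ret_near()
    return after_call, (cpu.sp, cpu.cs, cpu.ip)


for label, use_retf in (("RET ", False), ("RETF", True)):
    _, (sp, cs, ip) = run(use_retf)
    print(f"{label} -> SP:{sp:04X}, CS:{cs:04X}, IP:{ip:04X}")
```

With the NEAR return the model ends at CS:1820, IP:00C2 with SP at FFFD; with the FAR return it ends at CS:192D, IP:00C2 with SP back at FFFF, matching the two register dumps above.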
You use the term COM program, but your code suggests it is really a flat binary with an ORG (origin point) of 0000h instead of the ORG of 0100h that DOS COM programs normally use. To be compatible with DOS COM you have to load the code and data at an offset 256 bytes from the beginning of the Code Segment the program will be run from. In a DOS COM program, the first instruction executed is at CS:0100h, not CS:0000h.
In a typical DOS COM program the DOS loader pushes 0000h on the top of the stack. If you do a NEAR RET, execution continues at CS:0000h. The first 256 bytes of the segment contain the DOS Program Segment Prefix (PSP), and the first 2 bytes of the PSP (and thus CS:0000h) are an INT 20h instruction. INT 20h terminates a DOS COM program and returns an ERRORLEVEL of 0 to the DOS command prompt that launched the program. INT 20h should not be used to exit DOS EXE programs; use INT 21h/AH=4Ch instead.
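As a rough illustration of that layout, here is a Python sketch of what a DOS-style loader sets up. The helper name `load_com` is hypothetical, the model assumes a full 64 KiB segment is available (so SP starts at 0FFFEh), and only the first two bytes of the PSP are filled in. The image bytes are the encoding of the small program from the question.

```python
# Model of how a DOS-style loader lays out a COM program in its segment.
import struct

PSP_SIZE = 0x100  # the first 256 bytes of the segment hold the PSP

# Bytes of the small program from the question. The encoding does not depend
# on ORG because the program contains no absolute addresses.
COM_IMAGE = bytes([
    0xB8, 0x45, 0x09,  # mov ax, 0945h
    0xBB, 0x06, 0x00,  # mov bx, 0006
    0xB9, 0x28, 0x00,  # mov cx, 40
    0xCD, 0x10,        # int 10h
    0xC3,              # ret (NEAR)
])


def load_com(image):
    """Hypothetical helper: return (segment bytes, entry IP, initial SP)."""
    seg = bytearray(0x10000)
    seg[0:2] = b"\xCD\x20"                       # the PSP begins with INT 20h
    seg[PSP_SIZE:PSP_SIZE + len(image)] = image  # the program starts at CS:0100h
    sp = 0xFFFE                                  # assumption: full 64 KiB segment
    struct.pack_into("<H", seg, sp, 0x0000)      # word 0000h on top of the stack
    return seg, PSP_SIZE, sp


seg, entry_ip, sp = load_com(COM_IMAGE)
ret_target = struct.unpack_from("<H", seg, sp)[0]  # what the final NEAR RET pops
print(f"entry at CS:{entry_ip:04X}, NEAR RET goes to CS:{ret_target:04X}")
print("bytes at the RET target:", seg[ret_target:ret_target + 2].hex())
```

The NEAR RET at the end of the program pops 0000h, so execution lands on the INT 20h at the start of the PSP. Your loader has no PSP at CS:0000h, which is why a plain NEAR RET has nowhere sensible to go.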
If your intention is to find a way to use a NEAR RETURN to exit your program like DOS does, then you will have to provide a mechanism (code) inside the code segment to do that. Since you don't have a DOS PSP (or have chosen not to use one), you will have to find a place to copy such code within the segment. The easiest mechanism is to create a code trampoline on the stack before you start executing the program. The simplest way is to push the bytes of a FAR CALL (or FAR JMP) instruction onto the stack that takes you back to the instruction after the call far ptr WHERETOJ instruction.
A FAR CALL is encoded on the stack as:
9A oooo ssss
Where 9A is the opcode for a FAR CALL, oooo is the offset to call, and ssss is the segment to use. We want to keep the stack pointer (SP) on an even alignment¹, so we add a NOP instruction for a total of 6 bytes (6 is an even number). The stack would look like:
90 9A oooo ssss
Once you have the NOP+FAR CALL built on the stack, you need to push the offset of that code so that a NEAR RET will end up calling it when executed.
The path of execution in the COM program will be: the NEAR RET (ret) sets IP to the address on the stack where the NOP+FAR CALL is, and that code then calls back to the instruction after the FAR JMP that was used to start executing the COM program in the first place. You could encode a NOP+FAR JMP on the stack instead, but the NOP+FAR CALL has the advantage of pushing the value of CS on the stack, which could be useful later on, especially if you load more than one COM program in memory.
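The stack layout and the resulting SP values can be sketched in Python before looking at the assembly. This is a model under stated assumptions, not 8086 code: `build_and_return` is a made-up name, the caller CS (192Dh) and COM segment (1820h) are taken from the question's register dump, and the return offset 0020h is purely a placeholder for wherever the label after the FAR JMP ends up.

```python
# Model of the NOP + FAR CALL trampoline built on the COM program's stack.
import struct

MASK = 0xFFFF


def build_and_return(caller_cs, return_off, com_seg):
    """Build the trampoline, then follow the COM program's NEAR RET through it."""
    mem = bytearray(0x10000)  # the COM program's 64 KiB segment (CS = SS)
    sp = 0x0000               # the first push wraps SP to 0FFFEh

    def push(word):
        nonlocal sp
        sp = (sp - 2) & MASK
        struct.pack_into("<H", mem, sp, word & MASK)

    push(caller_cs)    # ssss: segment operand of the FAR CALL
    push(return_off)   # oooo: offset operand of the FAR CALL
    push(0x9A90)       # little-endian, so memory holds 90 (NOP) then 9A (FAR CALL)
    trampoline = sp    # address of the NOP
    push(trampoline)   # return address that the COM program's NEAR RET will pop

    # NEAR RET in the COM program: pop IP, CS unchanged.
    ip = struct.unpack_from("<H", mem, sp)[0]
    sp = (sp + 2) & MASK

    # Step over the NOP and decode the FAR CALL that follows it.
    if mem[ip] != 0x90 or mem[ip + 1] != 0x9A:
        raise ValueError("trampoline bytes are not NOP + FAR CALL")
    call_ip = ip + 1
    off, seg = struct.unpack_from("<HH", mem, call_ip + 1)

    # The FAR CALL pushes the COM segment and the IP after the 5-byte instruction.
    push(com_seg)
    push((call_ip + 5) & MASK)

    code = bytes(mem[trampoline:trampoline + 6])
    return code, trampoline, (seg, off), sp


code, addr, target, sp = build_and_return(0x192D, 0x0020, 0x1820)
print("trampoline bytes:", code.hex(" "), f"at offset {addr:04X}")
print(f"FAR CALL target {target[0]:04X}:{target[1]:04X}, SP on arrival {sp:04X}")
print(f"SP after cleaning up 10 bytes: {(sp + 10) & MASK:04X}")
```

In this model the trampoline sits at offset FFFAh, the caller is re-entered with SP at FFF6h, and adding 10 to SP (6 bytes of trampoline plus the 4 bytes pushed by the FAR CALL) brings it back to 0000h.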
A sample program written to run on 8086 or later processors could look like this:
.8086
STACKSEG SEGMENT STACK 'stack'
DW 512 DUP(?)
STACKSEG ENDS
DATASEG1 SEGMENT PARA 'data'
finstr db 0dh, 0ah, 'Returned from COM program', 0dh, 0ah, '$'
stacksp dw 0
stackbp dw 0
DATASEG1 ENDS
DATASEG2 SEGMENT PARA 'data'
WHERETOJ:
mov ax, 0945h
mov bx, 0057h
mov cx, 40
int 10h
ret
org 65536 ; Expand the segment to 64KiB
DATASEG2 ENDS
CODESEG1 SEGMENT PARA 'code'
main:
ASSUME DS: DATASEG1, SS:STACKSEG
mov [stacksp], SP
mov [stackbp], BP
ASSUME DS: DATASEG2, SS:DATASEG2
mov AX, DATASEG2
mov DS, AX
mov ES, AX ; DS=ES=SS=DATASEG2
; CLI ; If running on BUGGY 8088 you would need to have CLI/STI
MOV SS, AX
xor SP, SP ; SP = 0. Grow down from top of 64KiB SS segment
; STI
; Build FAR CALL on the COM programs stack
; to return to this code when NEAR RET done
push CS ; Put CS on stack as part of FAR CALL
mov AX, offset aftercom
; Push the IP of the instruction after the FAR JMP below
push AX
mov AX, 09a90h ; Put a NOP(90h) on the stack and 9AH (FAR CALL opcode)
push AX ; NOP used as padding to keep SP aligned on an even address
mov AX, SP
push AX ; Push a copy of SP on the stack. SP is the address of the
; NOP
; CALL FAR PTR segment:offset instruction built on stack
jmp far ptr WHERETOJ ; Start executing our program code
aftercom:
add SP, 10 ; When we return the stack has 10 bytes on it (6 bytes
; for FAR CALL and the NOP + 4 bytes of the CALLers
; IP and CS). Clean them up
ASSUME DS: DATASEG1, SS:STACKSEG
MOV AX, DATASEG1
MOV DS, AX
MOV ES, AX
mov AX, STACKSEG
; CLI ; If running on BUGGY 8088 you would need to have CLI/STI
MOV SS, AX ; Restore SS:SP one after another since interrupts
; will be off until the instruction after changing SS
mov SP, [stacksp]
; STI
mov BP, [stackbp]
mov AH, 09 ; Display a string saying we returned
mov DX, offset finstr
int 21h
mov AX, 4c00h ; Exit DOS EXE program with ERRORLEVEL 0
int 21h
CODESEG1 ENDS
END main
A properly functioning version of the code would look similar to this when run: [screenshot not available]
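¹ The sample sets SP to 0000h rather than 0FFFFh so that the stack stays on an even address. The first PUSH that gets done will wrap SP to 0FFFEh (0000h-0002h=0FFFEh) before writing the value on the stack.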