How to calculate the total size of a TSR block from multiple .asm files?

It has been 23 years since I last did anything in assembly, and I'm now writing a DOS TSR program just for the fun of it.

I had a rather big source file and decided to split it into smaller .asm files. The problem is that I'm now having trouble calculating the size of the TSR block. With a single .asm file I just did something like:

mov ax, offset end_of_code_label - offset start_of_code_label

but now that I have portions of code scattered among several source files this approach won't work.

I've found that playing with the linker would help, by manually specifying the link order and making sure that the last .obj file is the one with the "end_of_code_label".

Is there an elegant way to do this, or at least something that wouldn't be regarded as an ugly hack?

Solution

The simplest way to control the order of things is to put everything in segments according to where they need to be in the final program and then use a "template" assembly file that you link first to order the segments. By default the linker places segments in the order it encounters them, so if the first file it sees declares every segment used in your program, that file determines the order of the segments.

The fact that your TSR is supposed to be a single-segment COM file complicates things, but it's still possible to use multiple assembler segments. Assembler segments don't necessarily have to correspond one to one with the segments your program uses. In this case you would just use assembler segments to group things together.

As an example you could use a template file like this:

    EXTERN  init_start:NEAR
    PUBLIC  resident_end

PSPSEG  GROUP   RTEXT, REND, ITEXT

RTEXT   SEGMENT PUBLIC PARA 'RESIDENT'
    ORG 100h
start:
    jmp init_start
RTEXT   ENDS

REND    SEGMENT PUBLIC BYTE 'REND'
resident_end:
REND    ENDS

ITEXT   SEGMENT PUBLIC BYTE 'INIT'
ITEXT   ENDS

    END start

If you put all your resident code in the RTEXT segment and the initialization code in the ITEXT segment, then the former will be placed at the start of the program and the latter at the end. The symbol resident_end will be right in the middle, separating the code that needs to be kept in memory after the program exits from the code that doesn't.

The purpose of the GROUP directive is to inform the assembler and linker that RTEXT, REND, and ITEXT are all supposed to be one actual segment. This is important because they need to know that any addresses used should be relative to the group PSPSEG, the actual segment the program is using. Note that the GROUP directive creates what amounts to an alias; it doesn't actually have any effect on the order of things in the linked program.

Because COM programs always start executing at the beginning of the program, I've put code in this template that jumps to the initialization code at the end. If you were creating an EXE you wouldn't need this: you would leave RTEXT empty and use just END instead of END start. The file with the entry point would use END with the entry point's label instead.
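
As a rough, untested sketch of that EXE variant, the template might shrink to something like the following. It assumes that "empty" also means dropping the ORG 100h, which only exists to skip over the PSP in a COM file:

    PUBLIC  resident_end

PSPSEG  GROUP   RTEXT, REND, ITEXT

; Declared first so that resident code from other files lands here.
; Assumption: no ORG 100h and no jump, since an EXE has its own entry point.
RTEXT   SEGMENT PUBLIC PARA 'RESIDENT'
RTEXT   ENDS

REND    SEGMENT PUBLIC BYTE 'REND'
resident_end:
REND    ENDS

ITEXT   SEGMENT PUBLIC BYTE 'INIT'
ITEXT   ENDS

    END             ; no entry label here; the file with the entry point supplies it

One thing to watch for in that case: without the ORG 100h, offsets within the group no longer include the PSP, so the resident size passed to DOS would presumably have to add the PSP's 10h paragraphs.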

Here's a minimal TSR that is designed to be linked with the above template:

    EXTERN  resident_end:NEAR
    PUBLIC  init_start

PSPSEG  GROUP   RTEXT, ITEXT

RTEXT   SEGMENT PUBLIC PARA 'RESIDENT'
    ASSUME  DS:NOTHING, SS:NOTHING, CS:PSPSEG
old_handler DD 0cccccccch
interrupt_handler:
    jmp [old_handler]
RTEXT   ENDS

ITEXT   SEGMENT PUBLIC BYTE 'INIT'
    ASSUME  DS:PSPSEG, SS:PSPSEG, CS:PSPSEG
init_start:
    mov ax, 3508h
    int 21h        ; get old timer interrupt handler
    mov WORD PTR [old_handler], bx
    mov WORD PTR [old_handler + 2], es  ; segment goes in the high word
    mov dx, OFFSET interrupt_handler
    mov ax, 2508h
    int 21h        ; set new timer interrupt handler

    mov ax, 3100h
    mov dx, OFFSET resident_end + 15
    shr dx, 4
    int 21h        ; terminate and stay resident
ITEXT   ENDS

    END
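
To make the size calculation at the end of the initialization code concrete: function 31h expects the resident size in DX in 16-byte paragraphs, counted from the start of the PSP. Because of the ORG 100h in the template, OFFSET resident_end already includes the 256-byte PSP, so all that's left is rounding up to a whole paragraph. Using a made-up offset of 0123h for resident_end (the real value depends on your code), the arithmetic works out like this:

RESIDENT_BYTES  EQU 0123h                        ; hypothetical value of OFFSET resident_end
RESIDENT_PARAS  EQU (RESIDENT_BYTES + 15) SHR 4  ; 0132h SHR 4 = 13h paragraphs (130h bytes kept)

The real program does the shift at run time because the final offset of resident_end isn't known until link time. Also note that shr dx, 4 with an immediate count greater than 1 is an 80186-and-later instruction form; on a plain 8086 target you would load the count into CL instead.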

It's necessary to repeat the GROUP directive in each file, though you only have to list the segments used in that file. This code was written and tested (link-only) with a modern version of MASM. If you're using TASM, you may need to explicitly tell the assembler to make offsets relative to PSPSEG each time you use OFFSET, for example: mov dx, OFFSET PSPSEG:interrupt_handler.
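
To tie this back to splitting the code across several .asm files, here is a sketch of what one more source file contributing resident code might look like. The names count_tick and tick_count are hypothetical, invented purely for illustration. Since this file only uses RTEXT, that's the only segment listed in its GROUP directive:

    PUBLIC  count_tick

PSPSEG  GROUP   RTEXT

RTEXT   SEGMENT PUBLIC PARA 'RESIDENT'
    ASSUME  DS:NOTHING, SS:NOTHING, CS:PSPSEG
tick_count  DW 0            ; hypothetical resident data
count_tick:                 ; hypothetical helper, callable from the handler
    inc [tick_count]        ; addressed through CS, the only segment register assumed
    ret
RTEXT   ENDS

    END

The file containing the interrupt handler would declare EXTERN count_tick:NEAR and call it as usual. As long as the template's object file is linked first, everything in RTEXT should end up ahead of resident_end regardless of where this file appears in the rest of the link order, so the size calculation stays the same.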