
set the alignment of the data segment in TASM ideal mode


My assembly source code:

ideal 
model tiny 
segment _data byte  ; TASM doesn't accept it. 
ends _data 
 
dataseg 
align 1  ; Doesn't decrease the segment alignment. 
lpText  db "Hello, world!$" 
 
codeseg 
        startupcode 
        lea dx,[lpText] 
        mov ah,9 
        int 21h 
        exitcode 
end 

TASM 5.0 gives me the error Segment attributes illegally redefined for the line with byte in it.

How do I change the data segment to byte alignment in ideal mode?

I need it because I don't want an extra 0 byte in the generated .com file in front of lpText. I want the .com file to be as small as possible.


Solution

  • If you reopen a segment such as _DATA, the attributes you give must match the ones it was previously declared with. When you use the model directive with the simplified segment directives, _TEXT and _DATA default to WORD alignment. Your segment _data byte line tries to change that alignment from WORD to BYTE, which is why you get the error. The TASM 5 manual states:

    Segment attributes illegally redefined

    A SEGMENT directive reopens a segment that has been previously defined, and tries to give it different attributes. For example:

    DATA SEGMENT BYTE PUBLIC
    DATA ENDS
    DATA SEGMENT PARA                 ; error, previously had byte alignment
    DATA ENDS
    

    If you reopen a segment, the attributes you supply must either match exactly or be omitted entirely. If you don't supply any attributes when reopening a segment, the old attributes will be used.
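
    To illustrate the rule in Ideal mode syntax, here is a minimal sketch. The segment name MYDATA and the labels bFirst and bSecond are made up for this example and do not come from your program:

    ```asm
    ideal
    segment MYDATA byte public 'DATA'   ; first definition sets the attributes
    bFirst  db 1
    ends MYDATA

    segment MYDATA                      ; reopened with no attributes: the
    bSecond db 2                        ; original BYTE PUBLIC 'DATA' still apply
    ends MYDATA

    ; segment MYDATA word               ; uncommenting these two lines should
    ; ends MYDATA                       ; trigger "Segment attributes illegally redefined"
    end
    ```

    The same thing happens in your source: the model directive has already defined _DATA with WORD alignment, so reopening it with byte counts as a redefinition.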


    You have a couple of choices. The simpler one is to put the data in the code segment along with the code:

    ideal
    model tiny
    
    codeseg
            startupcode
            lea dx,[lpText]
            mov ah,9
            int 21h
            exitcode
    lpText:  db "Hello, world!$"
    
    end
    

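    Assembled and linked as a COM file, this should produce a 25-byte program. The bytes from offset 0x10b onward are the "Hello, world!$" string, which the disassembler shows as if they were instructions: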

    00000100  BA0B01            mov dx,0x10b
    00000103  B409              mov ah,0x9
    00000105  CD21              int 0x21
    00000107  B44C              mov ah,0x4c
    00000109  CD21              int 0x21
    0000010B  48                dec ax
    0000010C  656C              gs insb
    0000010E  6C                insb
    0000010F  6F                outsw
    00000110  2C20              sub al,0x20
    00000112  776F              ja 0x183
    00000114  726C              jc 0x182
    00000116  642124            and [fs:si],sp
    

    An alternative is to not use the model directive and declare your own segments from scratch:

    ideal
    group DGROUP _DATA, _TEXT
    
    segment _TEXT byte 'CODE'
    org 100h
    ends
    segment _DATA byte 'DATA'
    ends
    
    segment _DATA
    lpText  db "Hello, world!$"
    ends
    
    segment _TEXT
    _start:
        lea dx,[lpText]            ; or mov dx, offset lptext
        mov ah,9
        int 21h
        ret                        ; COM programs that use TINY model
                                   ; can exit with a RET. DOS places 0000h
                                   ; on the stack when program starts. Returning
                                   ; to 0000h executes an INT 20h instruction at
                                   ; offset 0000h in the PSP
    ends
    
    end _start
    

    The startupcode and exitcode directives don't work outside the simplified segment model, so you need to write that code yourself. Assuming you are using the tiny model to generate DOS COM programs, you can use ret to return to DOS as long as you don't need to return an error level, which reduces the size of the program. There is no need to set the segment registers, because CS, DS, ES and SS all point at the Program Segment Prefix (PSP) when DOS starts running a COM program. The program generated would look like this:

    00000100  BA0801            mov dx,0x108
    00000103  B409              mov ah,0x9
    00000105  CD21              int 0x21
    00000107  C3                ret
    00000108  48                dec ax
    00000109  656C              gs insb
    0000010B  6C                insb
    0000010C  6F                outsw
    0000010D  2C20              sub al,0x20
    0000010F  776F              ja 0x180
    00000111  726C              jc 0x17f
    00000113  642124            and [fs:si],sp
    

    The resulting COM program should be 22 bytes in size.


    If you had replaced ret in the previous code with:

    mov ah,4ch
    int 21h
    

    The resulting file would be 25 bytes in size and, more importantly, the _DATA segment would not be padded to a WORD boundary. lpText starts at the odd offset 0x10b, immediately after the code:

    00000100  BA0B01            mov dx,0x10b
    00000103  B409              mov ah,0x9
    00000105  CD21              int 0x21
    00000107  B44C              mov ah,0x4c
    00000109  CD21              int 0x21
    0000010B  48                dec ax
    0000010C  656C              gs insb
    0000010E  6C                insb
    0000010F  6F                outsw
    00000110  2C20              sub al,0x20
    00000112  776F              ja 0x183
    00000114  726C              jc 0x182
    00000116  642124            and [fs:si],sp
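
    If you do not need a separate data segment at all, the two approaches can be combined: declare a single byte-aligned segment by hand and keep the string in it. The following is an untested sketch under that assumption. The assume line and the use of offset instead of lea are my own choices for this example:

    ```asm
    ideal
    segment _TEXT byte 'CODE'
    assume cs:_TEXT, ds:_TEXT
    org 100h
    _start:
        mov dx,offset lpText       ; no group here, so the offset is relative to _TEXT
        mov ah,9                   ; DOS function 09h: print a $-terminated string
        int 21h
        ret                        ; return to the INT 20h at offset 0000h in the PSP
    lpText  db "Hello, world!$"
    ends
    end _start
    ```

    Since the instructions and the string are the same as in the 22-byte version above, I would expect this to produce the same 22 bytes, but I have not assembled it.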