I did a search on Stack Overflow, and I have not found anything similar to my problem. My problem is this: I have a code that opens a file and writes a message at the end. When I use int 21h to write to the file in the first time, it writes well if the file is empty, but if the file has content, the program adds to the end many trash bytes (characters like 畂 or another japanese or chinese characters).
I have checked that the program don't write more bytes than the message length. Please, help me. Here is my source code:
.model tiny
.code
main:
call delta
delta:
pop bp
sub bp, offset delta
mov ax, @code ;Get the address of code segment and store it in ax
mov ds, ax ;Put that value in Data segment pointer.
;Now, we can reference any data stored in the code segment
;without fail.
;Subroutines
open:
mov ax, 3D02H ;Opens a file
lea dx, [bp+filename];Filename
int 21h ;Call DOS interrupt
mov handle, ax ;Save the handle in variable
move_pointer_to_end:
mov bx, handle
mov ax,4202h ; Move file pointer
xor cx,cx ; to end of file
cwd ; xor dx,dx
int 21h
write:
mov ax, 4000H
mov bx, handle
lea dx, [bp+sign]
mov cx, 16
int 21H
exit:
mov ah,4Ch ;Terminate process
mov al,0 ;Return code
int 21h
datazone:
handle dw ?
filename db 'C:\A.txt', 0
sign db 'Bush was here!!', 0
end main
Please help me!!
That's because the file to which you're appending the data is encoded in unicode. If you write a file out from Notepad or another text editor and save it, you have to pick ANSI as the encoding. Then if you point your program at the ANSI encoded text file, it should append the string indicated with the expected result.
Unicode allocates two bytes for every character so in a hex editor you might see s.o.m.e.t.h.i.n.g. .l.i.k.e. .t.h.i.s.
rather than something like this
that you might expect for ANSI or UTF-8.