I have read that STOSB functions like this:
ES:[DI] <-- AL
If DF = 0
increment DI
else decrement DI
So why doesn't STOSB change DS:[DI]?
Because this definition would collide with the use of LODSB, which already uses DS:[SI]. Using a separate segment register gives you more flexibility.
Is there a special purpose for using extra segment?
Yes. You can transfer bytes between segments easily while processing them. For example, you can use LODSB to load AL from DS:[SI], modify AL, and then store it to a different segment, the Extra Segment, with STOSB using ES:[DI]. On the 8086, with its 64 KB segments (16-bit offsets) and 20-bit address space, this is really useful.
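As a rough sketch of that load/modify/store pattern (NASM syntax, 16-bit code): the routine name `to_upper_copy`, the uppercase conversion, and the register setup on entry are assumptions made for illustration, not something from the question.

```nasm
; Hypothetical routine. Assumed on entry:
;   DS:SI -> source bytes, ES:DI -> destination buffer, CX = byte count
to_upper_copy:
        cld                     ; DF = 0, so SI and DI count upward
        jcxz    .done           ; nothing to do for a zero-length buffer
.next:
        lodsb                   ; AL = DS:[SI], SI += 1
        cmp     al, 'a'
        jb      .store          ; below 'a': store unchanged
        cmp     al, 'z'
        ja      .store          ; above 'z': store unchanged
        sub     al, 'a' - 'A'   ; lowercase ASCII -> uppercase
.store:
        stosb                   ; ES:[DI] = AL, DI += 1
        loop    .next           ; CX -= 1, repeat while CX != 0
.done:
        ret
```

Because the read goes through DS and the write goes through ES, the source and destination can sit in different 64 KB segments without reloading a segment register inside the loop.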
Another instruction illustrating the use is REP MOVSB, which copies a sequence of bytes (with its length in CX) from DS:[SI] to ES:[DI].
(If you don't need to examine each byte as you copy it, you'd simply use rep movsb or rep movsw for better performance than lods/stos in a loop.)
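A minimal sketch of such a cross-segment copy with rep movsb (NASM syntax, 16-bit code). The constants `SRC_SEG`, `SRC_OFF`, `DST_SEG`, `DST_OFF`, and `COUNT`, along with their values, are placeholders chosen purely for illustration.

```nasm
; Hypothetical addresses and length, for illustration only
SRC_SEG equ 2000h
SRC_OFF equ 0000h
DST_SEG equ 3000h
DST_OFF equ 0000h
COUNT   equ 1024

copy_between_segments:
        push    ds
        push    es
        mov     ax, SRC_SEG
        mov     ds, ax          ; segment registers cannot take an immediate,
        mov     ax, DST_SEG     ; so load them through AX
        mov     es, ax
        mov     si, SRC_OFF     ; DS:SI -> source
        mov     di, DST_OFF     ; ES:DI -> destination
        mov     cx, COUNT       ; number of bytes to copy
        cld                     ; DF = 0: copy toward higher addresses
        rep movsb               ; repeat: ES:[DI] = DS:[SI], SI += 1, DI += 1, CX -= 1
        pop     es
        pop     ds
        ret
```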
In most string instructions we use the extra segment. Why?
Well, not in most, but in roughly half of them. Using another segment register gives you quick access to different segments: you are not limited to processing data in a single 64 KB segment, and you do not have to change the DS register before each access to a different segment.
stos and movs write es:[di], which makes sense because DI is the "destination index" register.
cmps and scas read from es:[di], which is maybe surprising for scas because it only has one memory operand, so you might expect it to read from ds:[si] like lods. It is all the more surprising because SCASB sets flags from AL - [mem], not the other way around, so it's like a cmp where memory is the right operand (source), not the left (destination): like cmp al, es:[di].
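To make the scas behaviour concrete, here is a sketch of a strlen-style scan using repne scasb (NASM syntax, 16-bit code). The routine name `far_strlen` is made up, and it assumes ES:DI points at a NUL-terminated string whose terminator lies within the scan range.

```nasm
; Hypothetical routine. Assumed on entry: ES:DI -> NUL-terminated string
; Returns: CX = length, not counting the terminator
far_strlen:
        cld                     ; DF = 0: scan toward higher addresses
        xor     al, al          ; AL = 0, the byte to look for
        mov     cx, 0FFFFh      ; maximum number of bytes to scan
        repne scasb             ; each step: flags = AL - ES:[DI], DI += 1, CX -= 1;
                                ; stops when the byte equals AL or CX reaches 0
        not     cx              ; CX = number of bytes scanned, terminator included
        dec     cx              ; drop the terminator from the count
        ret
```

Note that the string is addressed through ES:DI even though it is only being read, which is the point made above.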
Perhaps the architect of the 8086's instruction set imagined a use-case of a loop that does lods and scas to implement strcmp between segments.
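That speculated use-case could look roughly like this (NASM syntax, 16-bit code). It is only a sketch of the idea: the routine name `far_strcmp` and the calling convention are assumptions, and nothing here is known to be what the 8086 designers actually had in mind.

```nasm
; Hypothetical routine. Assumed on entry:
;   DS:SI -> string A, ES:DI -> string B, both NUL-terminated
; Returns: ZF = 1 if the strings are equal; otherwise the flags are those of
;          (byte of A) - (byte of B) at the first difference
far_strcmp:
        cld                     ; DF = 0: walk both strings upward
.next:
        lodsb                   ; AL = DS:[SI], SI += 1
        scasb                   ; flags = AL - ES:[DI], DI += 1
        jne     .done           ; bytes differ: flags already hold the result
        test    al, al          ; bytes match; was that the terminator?
        jnz     .next           ; no: compare the next pair
.done:
        ret                     ; on the fall-through path ZF = 1 (equal strings)
```

With lods reading via DS:SI and scas reading via ES:DI, the two strings can live in different segments with no segment-register juggling inside the loop.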