Les instruction purpose?

What is purpose of les instruction in assembly?

Why do we need to load es segment and a register? Book gives following example:

les    bx, p           ; Load p into ES:BX
mov    es:[bx], al     ; Store away AL

Why do we need to load es and bx in this case?

Also why do we use es:[bx]? If p points to 100h in memory, isn't both es and bx 100h = 200h (bx+es)?

Solution

Its too bad you are learning assembler for a microprocessor with a messy architecture. You get confusing concepts such as the LES instruction.

Conventional microprocessor have registers large enough to contain a full memory address. You can simply load the address of a memory location into a register, and then access that location (and usually those nearby with indexing) via the register.

Some machines (notably the Intel 286 in real mode, which seems to be what you are programming), had only 16 bit registers but could address 1MB of memory. In this case, a register doesn't have enough bits: you need 20 bits, but the registers are only 16 bits.

The solution is to have a second register that contains the missing bits. A simple scheme would have been to require 2 registers, one of which had the lower 16 bits, one of which had the upper 16 bits, to produce a 32 bit address. Then the instruction that references two registers makes sense: you need both to get a full memory address.

Intel chose a messier segment:offset scheme: the normal register (bx in your case) contains the lower 16 bits (the offset), and the special register (called ES) contains 16 bits which are left-shifted 4 bits, and added to the offset, to get the resulting linear address. ES is called a "segment" register, but this will make no sense unless you go read about the Multics operating system circa 1968.

(x86 allows other addressing modes for the "effective address" or "offset" part of an address, like es:[bx + si + 1234], but always exactly one segment register for a memory address.)

[Segments and segment registers really are an interesting idea when fully implemented the Multics way. If you don't know what this is, and you have any interest in computer and/or information architectures, find the Elliot Organick book on Multics and read it cover to cover. You will be dismayed at what we had in the late 60s and seem to have lost in 50 years of "progress". If you want a longer discussion of this, see my discussion on the purpose of FS and GS segment registers ]

What's left of the idea in the x86 is pretty much a joke, at least the way it it used in "modern" operating systems. You don't really care; when some hardware designer presents you with a machine, you have to live with it as it is.

For the Intel 286, you simply have to load a segment register and an index register to get a full address. Each machine insturction has to reference one index register and one segment register in order to form a full address. For the Intel 286, there are 4 such segment reigsters: DS, SS, ES, and CS. Each instruction type explicitly designates an index register and implicitly chooses one of the 4 segment registers unless you provide an explicit override that says which one to use. JMP instructions use CS unless you say otherwise. MOV instructions use DS unless you say otherwise. PUSH instructions use SS unless you say otherwise (and in this case you better not). ES is the "extra" segment; you can only use it by explicitly referencing it in the instruction (except the block move [MOVB} instruction, which uses both DS and ES implicitly).

Hope that helps.

Best to work with a more modern microprocessor, where segment register silliness isn't an issue. (For example, 32-bit mode x86, where mainstream OSes use a flat memory model with all segment bases = 0. So you can just ignore segmentation and have single registers as pointers, only caring about the "offset" part of an address.)