I'm writing some real-mode code that uses 32-bit registers (via the 0x66 operand-size prefix).
I've been looking through Intel's manuals, and cannot find the information I am looking for. See: http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html (I have skimmed Volume 1 Chapters 1-7, as well as specific instructions from Volume 2)
Does Intel guarantee specific behavior for the following code in real mode? Is it the same as in protected mode?
mov eax, <some constant>
mov ebx, <some constant>
add ax, bx ; Are the top 16 bits of eax zeroed, sign-extended, or left unchanged?
mov ax, <some constant> ; Does this leave the top 16 bits of eax unchanged?
; From what I can tell, the top 16 bits are unchanged, but where is this documented?
Note: I am not after how specific implementations act (i.e. code that checks it, unless every implementation has always acted the same), just where Intel has documented this behavior.
Related: x86_64 registers rax/eax/ax/al overwriting full register contents
How this is different: This question relates specifically to Real Mode operation, and whether the observations from the linked question are valid in Real Mode.
Can anyone point me to where this is documented for real mode code?
The fact that 16-bit and 8-bit alternate register names only access their respective subparts is more or less documented in section 3.4.1 of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1: Basic Architecture:
As shown in Figure 3-5, the lower 16 bits of the general-purpose registers map directly to the register set found in the 8086 and Intel 286 processors and can be referenced with the names AX, BX, CX, DX, BP, SI, DI, and SP. Each of the lower two bytes of the EAX, EBX, ECX, and EDX registers can be referenced by the names AH, BH, CH, and DH (high bytes) and AL, BL, CL, and DL (low bytes).
Since the documentation doesn't describe these alternate registers as accessing anything other than the indicated parts of the register, there's no reason to assume that they would. Also note that section 3.4.1 applies to all operating modes of the processor except 64-bit mode, so it includes real mode as well.
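To make the aliasing concrete, here is a minimal C sketch that models one 32-bit general-purpose register and its AX/AL/AH-style aliases. It is only a model of the behaviour described in section 3.4.1, not anything taken from Intel's manual: the type `reg32` and the helpers `write_x`, `write_l`, `write_h` and `add_x` are hypothetical names invented for this illustration, and flags are ignored.

```c
#include <stdint.h>

/* Hypothetical model of one 32-bit general-purpose register, e.g. EAX. */
typedef struct { uint32_t e; } reg32;

/* Write through the 16-bit alias (AX): bits 31..16 are left untouched. */
void write_x(reg32 *r, uint16_t v) {
    r->e = (r->e & 0xFFFF0000u) | v;
}

/* Write through the low-byte alias (AL): bits 31..8 are left untouched. */
void write_l(reg32 *r, uint8_t v) {
    r->e = (r->e & 0xFFFFFF00u) | v;
}

/* Write through the high-byte alias (AH): only bits 15..8 change. */
void write_h(reg32 *r, uint8_t v) {
    r->e = (r->e & 0xFFFF00FFu) | ((uint32_t)v << 8);
}

/* Model of "add ax, bx": a 16-bit add that wraps modulo 2^16 and is
   stored back through the 16-bit alias. The carry out of bit 15 does
   not propagate into bits 31..16 (flags are not modelled here). */
void add_x(reg32 *dst, const reg32 *src) {
    uint16_t sum = (uint16_t)((dst->e & 0xFFFFu) + (src->e & 0xFFFFu));
    write_x(dst, sum);
}
```

In this model, starting from 0x12345678 in one register and 0x0000FFFF in the other, `add_x` leaves 0x12345677 in the destination, and a following `write_x(&r, 0xABCD)` leaves 0x1234ABCD. The upper half is preserved in both cases, which matches what the question observed.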
Section 3.4.1.1 covers what happens in 64-bit mode, which is where the behaviour described in the post you linked comes from:
When in 64-bit mode, operand size determines the number of valid bits in the destination general-purpose register:
- 64-bit operands generate a 64-bit result in the destination general-purpose register.
- 32-bit operands generate a 32-bit result, zero-extended to a 64-bit result in the destination general-purpose register.
- 8-bit and 16-bit operands generate an 8-bit or 16-bit result. The upper 56 bits or 48 bits (respectively) of the destination general-purpose register are not modified by the operation. If the result of an 8-bit or 16-bit operation is intended for 64-bit address calculation, explicitly sign-extend the register to the full 64-bits.
Notably, the 8-bit and 16-bit alternate registers work the same way in 64-bit mode as they do in the other modes.
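The three rules quoted above from section 3.4.1.1 can be sketched as a small C model of a 64-bit destination register. The function names (`gpr_write64`, `gpr_write32`, `gpr_write16`, `gpr_write8`) are hypothetical and exist only for this illustration. Each takes the old register contents and the operation's result and returns the new register contents. The 8-bit case models the low-byte aliases only.

```c
#include <stdint.h>

/* 64-bit operand: the result replaces the whole register. */
uint64_t gpr_write64(uint64_t old, uint64_t result) {
    (void)old;              /* nothing of the old value survives */
    return result;
}

/* 32-bit operand: the result is zero-extended to 64 bits,
   so the old upper 32 bits are discarded. */
uint64_t gpr_write32(uint64_t old, uint32_t result) {
    (void)old;
    return (uint64_t)result;
}

/* 16-bit operand: the upper 48 bits are not modified. */
uint64_t gpr_write16(uint64_t old, uint16_t result) {
    return (old & 0xFFFFFFFFFFFF0000ull) | result;
}

/* 8-bit operand (low-byte alias): the upper 56 bits are not modified. */
uint64_t gpr_write8(uint64_t old, uint8_t result) {
    return (old & 0xFFFFFFFFFFFFFF00ull) | result;
}
```

For example, with an old value of 0x1122334455667788, `gpr_write32(old, 0xAABBCCDD)` gives 0x00000000AABBCCDD, while `gpr_write16(old, 0xABCD)` gives 0x112233445566ABCD. Only the 32-bit case clears upper bits, and that case applies only in 64-bit mode.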
Finally, even though the processor won't implicitly erase the upper 16 bits of the registers when you don't expect it, you can't necessarily depend on the environment you're executing in not to do so. Under MS-DOS it wasn't always safe to use the upper 16 bits of the 32-bit registers, because you never knew when some call, interrupt service routine or TSR would change them. The various calling conventions and interfaces only defined which 16-bit registers were preserved and modified; they rarely if ever mentioned what happened to the upper 16 bits.