assembly x86-64 instructions machine-code

REX.W prefix on x86-64 MOV instruction with segment register

I am reading the x86/x64 developer's manual trying to understand instruction encoding and got confused by the mov from segment register, i.e. the 12th and 13th form on this page. There are two specific questions:

(1) Both are labeled with REX.W prefix, but to my understanding this flag is only used when operand size is 64 bit. I can see why it should be used with r64, but except for that, why are these two instructions labeled with REX.W prefix?

(2) There is an m16 destination in both instructions. Isn't that a duplication? Why are these two instructions separated in the first place?

One reason I can think of for (2) is that in the first form, r16/r32 can be selected with the 66H prefix, while 66H is ignored when REX.W is present (for r64). But 66H doesn't seem to have any effect on m16 either, so why is m16 included in the second form (r64/m16)?

Solution

I believe that this is an editing error. I think that the REX prefix is supposed to be omitted from line 12, to contrast with the 16-bit form on line 11 and the 64-bit form on line 13 (just as there are 16-, 32-, and 64-bit forms for other variants).

(Perhaps there was an attempt to combine the three forms into a single line. The "description" column for that entry says "Move zero extended 16-bit segment register to r16/r32/r64/m16", so that's consistent with someone starting to merge the rows and then realizing that the r64 row should be separate, but forgetting to remove the REX.W from the 16/32-bit row.)

I think the reason that m16 appears on all three forms is that a move from a segment register into memory is always 16 bits, regardless of the operand size. mov from SR is a weird instruction.
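To make the memory-destination case concrete, here is a minimal sketch that builds the bytes for `mov [rax], ds` with each prefix choice. The helper name `encode_mov_mem_sreg` and its tables are made up for illustration, and it only handles base registers that need no SIB byte or displacement. The point is that the three encodings differ only in the prefix; as said above, the store itself is 16 bits in every case.

```python
# Segment register numbers as they appear in the ModRM.reg field.
SREG = {"es": 0, "cs": 1, "ss": 2, "ds": 3, "fs": 4, "gs": 5}

# Base registers that need no SIB byte or displacement with ModRM.mod = 00.
# (rsp, rbp, and r8-r15 are deliberately left out of this sketch.)
SIMPLE_BASE = {"rax": 0, "rcx": 1, "rdx": 2, "rbx": 3, "rsi": 6, "rdi": 7}

def encode_mov_mem_sreg(base, sreg, prefix=b""):
    """Encode `mov [base], sreg` (opcode 8C /r) for 64-bit mode."""
    modrm = (SREG[sreg] << 3) | SIMPLE_BASE[base]  # mod = 00: plain [base]
    return prefix + bytes([0x8C, modrm])

# No prefix, 66h operand-size prefix, and REX.W, in that order.
for name, prefix in (("none", b""), ("66h", b"\x66"), ("REX.W", b"\x48")):
    print(name, encode_mov_mem_sreg("rax", "ds", prefix).hex(" "))
```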

There is no reason to ever use the 64-bit form: all CPUs that support 64-bit mode zero-extend to the full register size with no prefixes.

The upper bits of the destination register are zero for ... and all Intel 64 processors.

The exceptions are 32-bit-only CPUs older than Pentium Pro, and Quark (based on P5) which is also 32-bit only.

I didn't check AMD manuals to see if there's a possibility that any AMD64 CPUs might leave the upper 6 bytes of RAX undefined or unmodified for 8c d8 (mov eax,ds with no prefixes). But Intel's manual is clear that all Intel64 CPUs will zero-extend to 32-bit (and thus implicitly to 64-bit like always when writing a 32-bit register).

The 66h operand size prefix can be used to encode 66 8c d8 (mov ax,ds) which leaves the upper bytes of RAX unmodified (like always for writing a 16-bit register).

Normally you'd never want this, but the operand-size prefix does affect mov reg, SR unlike REX.W.
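As a rough illustration of the three register-destination encodings discussed here, the following sketch assembles `mov ax/eax/rax, ds` by hand. The function `encode_mov_reg_sreg` and its lookup tables are hypothetical helpers, not part of any assembler library, and the sketch only covers the legacy eight registers (no REX.B for r8-r15).

```python
# Segment register numbers as they appear in the ModRM.reg field.
SREG = {"es": 0, "cs": 1, "ss": 2, "ds": 3, "fs": 4, "gs": 5}

# Register numbers for the ModRM.rm field, keyed by the 16-bit name;
# the same number selects ax, eax, or rax depending on operand size.
GPR = {"ax": 0, "cx": 1, "dx": 2, "bx": 3, "sp": 4, "bp": 5, "si": 6, "di": 7}

def encode_mov_reg_sreg(dst, sreg, size=32):
    """Encode `mov dst, sreg` (opcode 8C /r, register-direct) for 64-bit mode."""
    # 16-bit: 66h operand-size prefix; 32-bit: no prefix; 64-bit: REX.W (48h).
    prefix = {16: b"\x66", 32: b"", 64: b"\x48"}[size]
    modrm = 0xC0 | (SREG[sreg] << 3) | GPR[dst]  # mod = 11: register operand
    return prefix + bytes([0x8C, modrm])

for size in (16, 32, 64):
    print(size, encode_mov_reg_sreg("ax", "ds", size).hex(" "))
# 16 66 8c d8
# 32 8c d8
# 64 48 8c d8
```

The unprefixed 2-byte form is the one you'd normally want: it is the shortest, and on the CPUs discussed above it already leaves the upper bits of the full register zeroed.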