assembly x86-64 disassembly machine-code avx512

How Are Registers X/Y/ZMM16-31 Encoded?


As I understand it, since AVX, through the 3-byte VEX or EVEX prefix, you can encode up to 32 XMM/YMM/ZMM registers in 64-bit mode. But when looking through the Intel manual, beyond the statement that this is possible, I cannot find the part that explains how it actually works. The only extension fields I can see are the inverted REX fields, plus a special place in the EVEX prefix to encode mask registers.

You would need 2 bits somewhere to encode that many registers. Do you have to combine 2 of the inverted REX fields inside the VEX/EVEX prefixes somehow, or how does this process work?


Solution

  • xmm16..31 (and their ymm/zmm equivalents) are new with AVX-512 and only accessible via EVEX prefixes, which have 2 extra bits to extend each of the ModRM register fields (reg and r/m), and 5 more bits as a separate field for the third operand.

    REX + legacy-SSE, and VEX for AVX1/2 encodings, can only access xmm/ymm0..15.

    Wikipedia's EVEX article has a pretty good table that shows where the bits come from; I transcribed part of it here:

    Addr mode   Bit 4     Bit 3     Bits [2:0]            Register type
    REG         EVEX.R'   EVEX.R    ModRM.reg             General Purpose, Vector
    RM          EVEX.X    EVEX.B    ModRM.r/m             GPR, Vector
    NDS/NDD     EVEX.V'   EVEX.v3   EVEX.v2v1v0           Vector
    Base        0         EVEX.B    SIB.base (or modrm)   GPR
    Index       0         EVEX.X    SIB.index             GPR

    If the R/M operand is a vector register instead of a memory addressing mode, it uses both the X (index) and B (base) bits as extra register-number bits. That works because a register operand has no SIB.index field, so the X bit isn't needed to extend an index register to r8..r15.
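To make the table concrete, here is a minimal Python sketch that pulls the three 5-bit register numbers out of a register-direct EVEX instruction. The function name `evex_register_numbers` is made up for illustration, and it assumes the simplest layout: the `0x62` byte, three payload bytes (called P0..P2 here), a single opcode byte, then a ModRM byte with mod = 11. All of the extension bits are stored inverted in the prefix, so the sketch flips them before use.

```python
def evex_register_numbers(insn: bytes) -> tuple:
    """Return (reg, nds, rm) register numbers for a register-direct EVEX instruction.

    Assumed layout (illustrative only): 0x62, P0, P1, P2, one opcode byte, ModRM.
    """
    if len(insn) < 6 or insn[0] != 0x62:
        raise ValueError("not an EVEX-encoded instruction in the assumed layout")
    p0, p1, p2, modrm = insn[1], insn[2], insn[3], insn[5]
    if modrm >> 6 != 0b11:
        # With a memory operand, X and B extend SIB.index / base instead.
        raise ValueError("r/m operand is memory, not a register")

    # R, X, B, R', vvvv and V' are all stored inverted in the prefix.
    r = ((p0 >> 7) & 1) ^ 1
    x = ((p0 >> 6) & 1) ^ 1
    b = ((p0 >> 5) & 1) ^ 1
    r_hi = ((p0 >> 4) & 1) ^ 1          # EVEX.R'
    vvvv = ((p1 >> 3) & 0xF) ^ 0xF
    v_hi = ((p2 >> 3) & 1) ^ 1          # EVEX.V'

    reg = (r_hi << 4) | (r << 3) | ((modrm >> 3) & 7)   # R' : R : ModRM.reg
    rm = (x << 4) | (b << 3) | (modrm & 7)              # X  : B : ModRM.r/m
    nds = (v_hi << 4) | vvvv                            # V' : vvvv
    return reg, nds, rm


# vaddps zmm16, zmm17, zmm18 assembles to 62 A1 74 40 58 C2
print(evex_register_numbers(bytes.fromhex("62a1744058c2")))  # (16, 17, 18)
# vaddps zmm0, zmm1, zmm2 assembles to 62 F1 74 48 58 C2
print(evex_register_numbers(bytes.fromhex("62f1744858c2")))  # (0, 1, 2)
```

In the first example the only differences from the zmm0..2 encoding are the cleared (inverted) R', X and V' bits, which is where the fifth register-number bit comes from in each case.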


    In REX and VEX prefixes, the X bit goes unused when the source operand isn't memory with an indexed addressing mode. (See https://wiki.osdev.org/X86-64_Instruction_Encoding#REX_prefix, but note that in the register-number table earlier on that page, which shows X.Reg, X is just a placeholder for R or B, not REX.X. That's a confusing choice on that page.)
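For contrast, a similar sketch for a 3-byte VEX prefix shows why AVX1/2 encodings stop at register 15: each register number gets only one extension bit, and X takes no part when the r/m operand is a register. As before, the function name `vex3_register_numbers` is hypothetical, and the layout is assumed to be `0xC4`, two payload bytes, one opcode byte, then ModRM with mod = 11.

```python
def vex3_register_numbers(insn: bytes) -> tuple:
    """Return (reg, vvvv, rm) register numbers for a register-direct 3-byte VEX instruction.

    Assumed layout (illustrative only): 0xC4, byte1, byte2, one opcode byte, ModRM.
    """
    if len(insn) < 5 or insn[0] != 0xC4:
        raise ValueError("not a 3-byte VEX instruction in the assumed layout")
    b1, b2, modrm = insn[1], insn[2], insn[4]
    if modrm >> 6 != 0b11:
        raise ValueError("r/m operand is memory, not a register")

    # R, B and vvvv are stored inverted; X (bit 6 of byte1) is not used here.
    r = ((b1 >> 7) & 1) ^ 1
    b = ((b1 >> 5) & 1) ^ 1
    vvvv = ((b2 >> 3) & 0xF) ^ 0xF

    reg = (r << 3) | ((modrm >> 3) & 7)   # R : ModRM.reg, 4 bits total
    rm = (b << 3) | (modrm & 7)           # B : ModRM.r/m, 4 bits total
    return reg, vvvv, rm


# vaddps ymm8, ymm9, ymm10 assembles to C4 41 34 58 C2
print(vex3_register_numbers(bytes.fromhex("c441345858c2"[:4] + "3458c2")))  # (8, 9, 10)
```

Every result is at most 4 bits wide, so there is simply no way to name register 16 or higher with this prefix.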

    See also "x86 BSWAP instruction REX doesn't follow Intel specs?" for another diagram of using an extra register-number bit from a REX prefix.