
Must all 16 bytes of an x86 MASKMOVDQU instruction be valid memory?


When using the x86 MASKMOVDQU instruction, must there always be 16 bytes of writable memory at the target, even if some of the mask bits are zero?

For example, let's say that I write to address 0x12345FFC using MASKMOVDQU. The page at 0x12345000 is valid memory, but the page at 0x12346000 is not. If the mask register is 0x00000000'00000000'00000000'FFFFFFFF, will this MASKMOVDQU always work, or could an exception occur?

The Intel manual says the following about an all-zero mask, but doesn't mention the edge case I'm talking about:

Behavior with a mask of all 0s is as follows:

• No data will be written to memory.

• Signaling of breakpoints (code or data) is not guaranteed; different processor implementations may signal or not signal these breakpoints.

• Exceptions associated with addressing memory and page faults may still be signaled (implementation dependent).

• If the destination memory region is mapped as UC or WP, enforcement of associated semantics for these memory types is not guaranteed (that is, is reserved) and is implementation-specific.


Solution

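  • See the third bullet point. It says explicitly that exceptions may still be signaled even when every mask bit is zero. By the same reasoning, an exception may also be generated for bytes that are masked out of a partially masked write, so the store to 0x12345FFC in your example is not guaranteed to succeed: whether it faults depends on the implementation.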

    Indeed, the AMD manual is clearer on this point:

    Exception and trap behavior for elements not selected for storage to memory are implementation dependent. For instance, a given implementation may signal a data breakpoint or a page fault for bytes that are zero-masked and not actually written.
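
Since the fault behavior is implementation dependent, one defensive option is to avoid issuing MASKMOVDQU whenever its 16-byte window spans two pages. The following is a minimal sketch of that idea, not a definitive implementation. It assumes 4 KiB pages (as in the question's addresses); `PAGE_SIZE_ASSUMED`, `window_crosses_page`, and `masked_store16` are hypothetical names introduced here for illustration. When the window crosses a page, or when SSE2 is not available at compile time, it falls back to plain byte stores that touch only the selected bytes.

```c
#include <stddef.h>
#include <stdint.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

/* Assumption: 4 KiB pages, matching the addresses used in the question. */
#define PAGE_SIZE_ASSUMED 4096u

/* Returns 1 if the 16-byte window starting at addr spans two pages. */
static int window_crosses_page(uintptr_t addr)
{
    return (addr & (PAGE_SIZE_ASSUMED - 1u)) > PAGE_SIZE_ASSUMED - 16u;
}

/* Hypothetical helper: store src[i] to dst[i] for every i whose mask byte
 * has its most significant bit set (the same selection rule MASKMOVDQU uses).
 * MASKMOVDQU is only issued when the whole window lies within one page. */
static void masked_store16(uint8_t *dst, const uint8_t src[16],
                           const uint8_t mask[16])
{
#if defined(__SSE2__)
    if (!window_crosses_page((uintptr_t)dst)) {
        __m128i v = _mm_loadu_si128((const __m128i *)src);
        __m128i m = _mm_loadu_si128((const __m128i *)mask);
        _mm_maskmoveu_si128(v, m, (char *)dst);
        return;
    }
#endif
    /* Fallback: scalar stores never access masked-out bytes. */
    for (size_t i = 0; i < 16; i++) {
        if (mask[i] & 0x80u)
            dst[i] = src[i];
    }
}
```

With the question's address, `window_crosses_page(0x12345FFC)` reports a crossing, so the sketch would take the scalar path and write only the four selected bytes on the valid page. The single-page check is conservative: it still assumes that all 16 bytes of a window inside one page are writable.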