What is the difference between the ret instruction in x86 and x64?

I was recently trying out a stack overflow exercise on x64. When performing this on x86, I would expect the following for a junk overwrite address (e.g. 'AAAA'):

1. The data I provide overflows the buffer and overwrites the return address.
2. Upon ret, the (overwritten) return address is (effectively) popped into the EIP register.
3. The address turns out not to be valid, and a segmentation fault is raised.

In x64, this seems different (beyond the interchange of EIP with RIP in the above steps). When providing a junk address of 'AAAAAAA', the processor seems to do some validity checking before popping the address. From observation, the two most significant bytes of the address must be null before it is loaded; otherwise, a segfault occurs. I believe this is due to the use of 48-bit addressing in x64. However, I was under the impression that addresses starting with 0xFFFF were also valid, yet this also produces a segfault.

Is this an accurate description of the difference? Why is this check performed before the data is loaded into the RIP register, whilst the other validity check is performed afterwards? Are there any other differences between these instructions?

EDIT: To clarify my observations: when an 8-byte return address is provided, RIP still points to the address of the ret instruction and RSP still points to the overwritten return address when the segfault occurs. When a 6-byte return address is provided, the overwritten address has already been popped into RIP when the segfault is observed.

Solution

Interesting that RSP doesn't get updated before the fault. So it's not code-fetch from a non-canonical address that faults, it's the ret instruction's attempt to set RIP to a non-canonical address.

That makes the whole RET instruction fault, meaning that none of its effects are visible. (Intel's manual doesn't define any partial-progress behaviour for ret, i.e. no state is documented as being updated even when it faults.)
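For reference, here is a minimal sketch of what "canonical" means, assuming 48-bit virtual addresses (4-level paging; CPUs with 5-level paging use 57 bits instead). The helper name `is_canonical` and the `VA_BITS` constant are made up for illustration. Bits 63..47 must all be copies of bit 47, so an address starting with 0xFFFF is only canonical if bit 47 is also set.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 48-bit virtual addresses (4-level paging). */
#define VA_BITS 48

/* Hypothetical helper: true if bits 63..47 of addr are all equal,
 * i.e. the address is the sign-extension of its low 48 bits. */
static bool is_canonical(uint64_t addr)
{
    uint64_t upper = addr >> (VA_BITS - 1);  /* the 17 bits 63..47 */
    return upper == 0 || upper == 0x1FFFF;   /* all zeros or all ones */
}
```

Under that assumption, 0x4141414141414141 is non-canonical, 0x0000414141414141 is canonical (though probably unmapped), and a value like 0xFFFF414141414141 is still non-canonical because bit 47 is clear. That would explain a fault with a 0xFFFF prefix, if the rest of the junk address was 'A' bytes.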

Unfortunately, the Operation section for ret in Intel's manual is a rat's nest of conditionals, because they use one block to document near and far returns and every combination of mode and operand-size. Plain ret in 64-bit mode is "IA-32e mode", operand-size=64, and "near" (not changing CS to a different code segment, just changing RIP).

In that case, x86-64 normal ret is basically pop rip.
32-bit mode normal ret is basically pop eip.
Nothing more, nothing less. RIP = *RSP++.
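To tie that to the observations in the question, here is a toy C model of a 64-bit near ret. It is only a sketch: `struct cpu`, `near_ret64` and `is_canonical48` are hypothetical names, it assumes 48-bit virtual addresses, and it models the fault as a `false` return rather than a real #GP exception.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy machine state: only the two registers ret touches. */
struct cpu {
    uint64_t rip;
    const uint64_t *rsp;  /* models RSP as a pointer to 8-byte stack slots */
};

/* Assumption: 48-bit virtual addresses, so bits 63..47 must all match. */
static bool is_canonical48(uint64_t addr)
{
    uint64_t upper = addr >> 47;
    return upper == 0 || upper == 0x1FFFF;
}

/* Returns true if ret completes. Returns false to model the fault,
 * in which case neither RIP nor RSP has been modified. */
static bool near_ret64(struct cpu *c)
{
    uint64_t target = *c->rsp;      /* read the return address */
    if (!is_canonical48(target))
        return false;               /* ret itself faults; nothing committed */
    c->rip = target;                /* RIP = *RSP */
    c->rsp += 1;                    /* RSP += 8 */
    return true;
}
```

With an 8-byte junk address the model fails inside `near_ret64` with RIP and RSP unchanged, matching the first observation. With a 6-byte address (top two bytes zero) it succeeds, and on real hardware the segfault would then come from the code fetch at the new RIP.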