Is there a faster way to pack two 32-bit x86 registers into one 128-bit XMM register than the following?
movd xmm0, edx
movd xmm1, eax
pshufd xmm0, xmm0, 1
por xmm0, xmm1
So if EAX is 0x12345678 and EDX is 0x87654321, the low 64 bits of xmm0 must be 0x8765432112345678.
With SSE4.1 you can use movd xmm0, eax / pinsrd xmm0, edx, 1 and do it in 2 instructions.
For older CPUs you can use two movd instructions followed by punpckldq, for a total of 3 instructions:
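movd xmm0, eax          ; xmm0 dword 0 = EAX (low half of the result)
movd xmm1, edx          ; xmm1 dword 0 = EDX (high half of the result)
punpckldq xmm0, xmm1    ; xmm0 = { EAX, EDX, 0, 0 }, low 64 bits = EDX:EAX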
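Operand order matters with punpckldq: the destination's low dword stays in lane 0 and the source's low dword lands in lane 1. As a quick way to check the lane layout without writing assembly, here is a minimal sketch in C using SSE2 intrinsics. The helper names pack_u32_pair and low_qword are made up for illustration, and the compiler is free to pick different instructions than the ones named in the comments.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: builds an XMM value whose low 64 bits are hi:lo.
   Mirrors the movd / movd / punpckldq sequence. */
static __m128i pack_u32_pair(uint32_t lo, uint32_t hi)
{
    __m128i a = _mm_cvtsi32_si128((int)lo);  /* movd: lane 0 = lo, other lanes zero */
    __m128i b = _mm_cvtsi32_si128((int)hi);  /* movd: lane 0 = hi, other lanes zero */
    return _mm_unpacklo_epi32(a, b);         /* punpckldq: lanes = { lo, hi, 0, 0 } */
}

/* Hypothetical helper: copies the low 64 bits out so the layout can be inspected. */
static uint64_t low_qword(__m128i v)
{
    uint64_t out;
    _mm_storel_epi64((__m128i *)&out, v);    /* stores only the low 64 bits */
    return out;
}
```

With lo = 0x12345678 (EAX) and hi = 0x87654321 (EDX), low_qword(pack_u32_pair(lo, hi)) should give 0x8765432112345678, which matches the layout asked for in the question. Swapping the two arguments to _mm_unpacklo_epi32 would swap the halves.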