The code below works fine if compiled for 32-bit (with the applicable register renaming). When compiled for 64-bit, however, it produces the warning "Warning: Object file "project1.o" contains 32-bit absolute relocation to symbol ".data.n_tc_p$project1_orbitkeyheader64$int64$longint$$int64_shufidx"." and throws an error when executed.
function SwapBytes64(const Val: Int64): Int64;
{$A 16}
const
  SHUFIDX : array [0..1] of Int64 = ($0001020304050607, 0);
begin
  asm
    movq xmm0, rcx
    pshufb xmm0, SHUFIDX // throws
    movq rax, xmm0
  end;
end;
How do I rectify this (ideally aligning the constant)?
EDIT: I also tried using movdqu.
ANSWER: This is the result of applying @Jester's answer:
function SwapBytes64(const Val: Int64): Int64;
const
  SHUFIDX : array [0..1] of Int64 = ($0001020304050607, 0);
begin
  asm
    movq xmm0, rcx
    movdqu xmm1, [rip+SHUFIDX] // rip-relative, unaligned load of the mask
    pshufb xmm0, xmm1
    movq rax, xmm0
  end;
end;
This works too, but there is no apparent speed benefit:
function SwapBytes64(const Val: Int64): Int64;
const
  SHUFIDX : array [0..1] of Int64 = ($0001020304050607, 0);
begin
  asm
    movq xmm0, rcx
    // Note: a memory operand of pshufb must be 16-byte aligned
    pshufb xmm0, [rip+SHUFIDX]
    movq rax, xmm0
  end;
end;
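To check that either variant above returns the right value, the result can be compared against a plain Pascal byte swap. This is only a minimal sketch: SwapBytes64Ref is a hypothetical helper name that does not appear in the original code, and the test value is arbitrary. The RTL's SwapEndian is used as a second cross-check.

```pascal
program SwapCheck;
{$mode objfpc}

// Hypothetical reference helper (not from the original post):
// reverses the byte order of a 64-bit value without any assembler.
function SwapBytes64Ref(const Val: QWord): QWord;
var
  i: Integer;
begin
  Result := 0;
  for i := 0 to 7 do
    // Take byte i (counted from the low end) and append it at the low end,
    // so the lowest input byte ends up as the highest output byte.
    Result := (Result shl 8) or ((Val shr (8 * i)) and $FF);
end;

const
  // Arbitrary test value, chosen so every byte is distinct.
  TestVal: QWord = $0102030405060708;
begin
  // Prints 0807060504030201
  WriteLn(HexStr(SwapBytes64Ref(TestVal), 16));
  // Cross-check against the RTL's built-in SwapEndian.
  WriteLn(SwapBytes64Ref(TestVal) = SwapEndian(TestVal));
end.
```

The same comparison can be run against the assembler version of SwapBytes64 to confirm that the shuffle mask is correct.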
It might not be an alignment issue at all. The compiler has warned you that your absolute reference to SHUFIDX will be truncated to 32 bits. If the address is not within the first 4 GiB, that will result in a wrong memory reference. You should check this in a debugger.
As a workaround, you should use rip-relative or indirect addressing. The former could look like movdqu xmm1, [rip+SHUFIDX] or movdqu xmm1, rel SHUFIDX or something similar. Consult your compiler's manual.
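To illustrate the truncation described above, here is a small sketch of what happens when only the low 32 bits of a 64-bit address are kept. The address is a made-up value above 4 GiB, chosen purely for illustration; the real address of SHUFIDX has to be looked up in a debugger.

```pascal
program TruncDemo;
{$mode objfpc}

const
  // Hypothetical 64-bit address above the first 4 GiB (illustration only).
  FullAddr: QWord = $00007FF612345678;
var
  Truncated: QWord;
begin
  // A 32-bit absolute relocation can only store the low 32 bits.
  Truncated := FullAddr and $FFFFFFFF;
  WriteLn('full address:      ', HexStr(FullAddr, 16));
  WriteLn('32-bit relocation: ', HexStr(Truncated, 16));
  // The two differ, so the instruction would read from the wrong location.
  WriteLn('same location:     ', FullAddr = Truncated);
end.
```

A rip-relative reference avoids this because it encodes the distance from the instruction to the constant rather than the constant's absolute address.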