I usually write portable C code and try to stick to the strictly standard-conforming subset of the features supported by compilers.
However, I'm writing code that exploits the ARMv8 Cryptography Extensions to implement SHA-1 (and SHA-256 some days later). The problem I face is that FIPS-180 specifies the hash algorithms using big-endian byte order, whereas most ARM-based OS ABIs are little-endian.
If it were a single integer operand (in a general-purpose register), I could use the APIs specified for the next POSIX standard, but I'm working with SIMD registers, since that's where the ARMv8 Crypto instructions operate.
So Q: how do I swap the byte order of the words in a vector register on ARM? I'm fine with assembly answers, but I'd prefer ones using ACLE intrinsics.
The instructions are REV16 for byte-swapping short (16-bit) integers, REV32 for byte-swapping 32-bit integers, and REV64 for byte-swapping 64-bit integers.
They can also reverse the order of larger elements within each container, for any element size strictly smaller than the width their name indicates: REV32 works on 8-bit or 16-bit elements, and REV64 on 8-bit, 16-bit, or 32-bit elements. They're defined in sections C7.2.219 to C7.2.221 of the Arm® Architecture Reference Manual Armv8, for A-profile architecture ("DDI0487G_b_armv8_arm.pdf").
For example, REV32 can be used to reverse the order of the 2 short integers within each 32-bit word:
[00][01][02][03][04][05][06][07]
to
[02][03][00][01][06][07][04][05]
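The byte layout above can be reproduced with a small scalar model in portable C. This is only a sketch of the lane-reversal semantics, not how the hardware does it; `rev_model` and its parameters are hypothetical names made up for this illustration and are not part of ACLE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not an ACLE API): models what REV16/REV32/REV64 do to
 * a vector held in memory. Within every `container`-byte group, the order of
 * the `elem`-byte elements is reversed; the bytes inside each element keep
 * their order. */
static void rev_model(uint8_t *v, size_t len, size_t container, size_t elem)
{
    uint8_t tmp[8]; /* the widest container is 8 bytes (REV64) */

    /* The element must be strictly smaller than the container. */
    assert(elem > 0 && elem < container && container <= sizeof tmp);
    assert(container % elem == 0 && len % container == 0);

    for (size_t base = 0; base < len; base += container) {
        size_t n = container / elem; /* elements per container */
        for (size_t i = 0; i < n; i++)
            memcpy(tmp + (n - 1 - i) * elem, v + base + i * elem, elem);
        memcpy(v + base, tmp, container);
    }
}
```

Calling `rev_model(v, 8, 4, 2)` on the bytes 00 through 07 models REV32 with 16-bit elements and yields the layout shown above; `rev_model(v, 8, 4, 1)` models the byte swap of each 32-bit word.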
Their intrinsics are defined in a separate document: Arm Neon Intrinsics Reference "advsimd-2021Q2.pdf"
To byte-swap each 32-bit word in a 128-bit vector, use the vrev32q_u8 intrinsic. The relevant vreinterpretq_* intrinsics need to be used to re-interpret the type of the operands (for example, between uint8x16_t and uint32x4_t).
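As a minimal sketch of how this might look when loading one 16-byte chunk of a SHA-1 message block on a little-endian target: `load_be32x4` is a hypothetical helper name, and the scalar `#else` branch is my own addition so the snippet still builds where NEON is unavailable (or on a big-endian target, where I have not checked the intrinsic path).

```c
#include <stdint.h>

#if defined(__ARM_NEON) && !defined(__ARM_BIG_ENDIAN)
#include <arm_neon.h>
#endif

/* Hypothetical helper: reads 16 message bytes and writes four 32-bit words,
 * interpreting each group of 4 bytes as a big-endian integer (the byte order
 * FIPS-180 uses). */
static void load_be32x4(const uint8_t in[16], uint32_t out[4])
{
#if defined(__ARM_NEON) && !defined(__ARM_BIG_ENDIAN)
    uint8x16_t bytes   = vld1q_u8(in);                  /* load 16 bytes as-is */
    uint8x16_t swapped = vrev32q_u8(bytes);             /* REV32 on 8-bit lanes */
    uint32x4_t words   = vreinterpretq_u32_u8(swapped); /* type change only */
    vst1q_u32(out, words);
#else
    /* Portable fallback (assumption, not part of the NEON approach). */
    for (int i = 0; i < 4; i++) {
        out[i] = ((uint32_t)in[4 * i]     << 24) |
                 ((uint32_t)in[4 * i + 1] << 16) |
                 ((uint32_t)in[4 * i + 2] <<  8) |
                  (uint32_t)in[4 * i + 3];
    }
#endif
}
```

In real SHA code you would presumably keep the `uint32x4_t` in a register and feed it to the crypto intrinsics rather than storing it back to memory; the store is here only to make the result easy to inspect.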