I'm currently studying encryption and I was coming up with ways of reversing the nibbles of a byte (ex. 0xF5
=> 0x5F
). I came up with this solution:
byte >> 4 | (byte & 0x0F) << 4
Other solutions I found online were similar, but added one extra operand by masking the left nibble of a byte:
(byte & 0xF0) >> 4 | (byte & 0x0F) << 4
In binary, the second solution looks like this:
# Extract the right nibble of a byte and shift to the left
[0xF5] 1111 0101 # Original Value
[0x0F] 0000 1111 & # Mask Right Nibble
[0x05] 0000 0101 = # Extracted Right Nibble
[0x50] 1010 0000 << 4 # Shift Four Bits to the Left
# Extract the left nibble of a byte and shift to the right
[0xF5] 1111 0101 # Original Value
[0xF0] 1111 0000 & # Mask Left Nibble
[0xF0] 1111 0000 = # Extracted Left Nibble
[0x0F] 0000 1111 >> 4 # Shift Four Bits to the Right
# Combine the shifted nibbles together
[0x05] 0000 1111 # Left Nibble Shifted to the Right
[0xF0] 0101 0000 | # Right Nibble Shifted to the Left
[0xF5] 0101 1111 = # New Value
Correct me if I'm wrong, but the second solution is useful if you're dealing with a data type that is larger than one byte and are solely focused on the least significant byte. So if you were to shift without masking, the bits from the higher-ordered bytes will propagate into the left nibble. Thus, masking the left nibble is necessary in regards to that scenario.
# Bits shift into the least significant byte without masking
[0x0AF5] 0000 1010 1111 0101
[0x00AF] 0000 0000 1010 1111 >> 4
On the other hand, if you're truly working with one byte, isn't masking the left nibble redundant as the left nibble bits will get zeroed out after right-shifting four bits?
[0xF5] 1111 0101
[0x0F] 0000 1111 >> 4
Maybe there are other reasons to mask the left nibble bits that I'm unaware of. For example, other systems might behave differently where the second solution is necessary.
Do I have my head in the right place or is there anything I should consider?
Here is an example code for further clarification:
typedef uint8_t byte;
static inline
byte swap_nibbles(byte bits) {
return bits >> 4 | (bits & 0x0F) << 4;
}
You do not show the definition of byte
. If it has a signed eight-bit integer, then this code:
signed char byte = -111; /* 0x91 */
printf("0x%hhX\n", byte);
byte = byte >> 4 | (byte & 0x0F) << 4;
printf("0x%hhX\n", byte);
prints, in many C implementations:
0x91 0xF9
Although byte
initially contains the bits 9116, in byte >> 4
, it is promoted to int
. Since those bits represent the value −111, this produces an int
with that value, which is, with a four-byte int
, FFFFFF9116. Then >> 4
produces FFFFFF916. (This is implementation-defined.) ORing this with the 1016 from the right side of the |
produces FFFFFF916, and then assigning that to byte
converts it to signed char
. That is implementation-defined, but the most common result is to wrap modulo 256, producing the bits F916, representing the value −7.
In contrast, using byte = (byte &0xF0) >> 4 | (byte & 0x0F) << 4;
prints:
0x91 0x19
However, if byte
has an unsigned eight-bit integer type, then the promotion to int
is not a problem, as int
has enough headroom to hold the values in these expressions without running into sign or overflow issues.
Modern compilers may be heavily optimized for bit-twiddling operations, so it is likely either of these expressions will be well optimized for the target architecture. Your first concern should be correctness.