Consider a pared-down example of down-casting unsigned to unsigned char:
void unsigned_to_unsigned_char(unsigned *sp, unsigned char *dp)
{
    *dp = (unsigned char)*sp;
}
The above C code is translated to assembly with gcc -Og -S as
movl (%rdi), %eax
movb %al, (%rsi)
For what reason is the C-to-assembly translation not as below?
movb (%rdi), %al
movb %al, (%rsi)
Is it because this is incorrect, or because movl is more conventional, or shorter in encoding, than movb?
Writing to an 8-bit x86 register can incur an extra merge µop, because the new low byte has to be merged with the old high bytes of the corresponding 32/64-bit register. It can also create an unexpected data dependency on the previous value of that register.
For this reason, it is generally a good idea to write only to the 32/64-bit variants of general-purpose registers on x86.
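As a rough sketch of why both sequences store the same value: x86 is little-endian, so the byte at the address in %rdi is the low byte of the 32-bit value. The C below mirrors the two approaches. The names first_byte_only, is_little_endian and load_byte_zero_extended are hypothetical helpers introduced only for this illustration; they are not part of the question's code. The last helper shows the usual way to get a byte into a register without a partial-register write: GCC typically compiles it to movzbl on x86-64, which writes the whole 32-bit register.

```c
#include <string.h>

/* The function from the question: load all 32 bits, keep the low 8. */
void unsigned_to_unsigned_char(unsigned *sp, unsigned char *dp)
{
    *dp = (unsigned char)*sp;
}

/* Hypothetical byte-only variant: copy just the first byte in memory,
   as the movb (%rdi), %al sequence would. It matches the cast above
   only on a little-endian target such as x86. */
void first_byte_only(const unsigned *sp, unsigned char *dp)
{
    memcpy(dp, sp, 1);
}

/* Hypothetical helper: returns 1 when the low-order byte is stored first. */
int is_little_endian(void)
{
    unsigned probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Hypothetical helper: a byte load widened to unsigned. On x86-64, GCC
   typically emits movzbl (%rdi), %eax here, so the full 32-bit register
   is written and no merge with old upper bytes is needed. */
unsigned load_byte_zero_extended(const unsigned char *p)
{
    return *p;
}
```

Calling unsigned_to_unsigned_char and first_byte_only on the same source value should give the same byte whenever is_little_endian() returns 1, which is always the case on x86. So the movb load would not be wrong; the choice of movl is about avoiding the partial-register write described above.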