I've been reading about endianness again, after a few months with MIPS. I'm a little confused about when it matters for loading/storing from/to memory, so can someone verify whether my understanding is correct? I don't have a Big Endian machine to test it on, and for some reason I can't get qemu to work.
Example 1:
lw $t0,word_
word_: .word 0xAABBCCDD // behaves the same in both Big Endian and Little Endian
Example 2:
lw $t0,bytearr_
bytearr_ : .byte 0xAB, 0xCD, 0xEF, 0xAA // either 0xABCDEFAA on BE or 0xAAEFCDAB on LE (?)
Example 3:
lhw $t0,b2hw
b2hw : .byte 0xAB, 0xCD //can this lead to issues as well? (LE is 0xCDAB, BE is 0xABCD)
Please correct me if I'm wrong or if I'm missing any potential conversion that could go wrong from one endianness to the other. Thanks!
EDIT:
What is going to happen in the case of LE/BE if I attempt to load a word into a halfword or a halfword into a word? For instance lw $t0, hw_ where hw_: .half 0xABCD, and lhw $t0, w_ where w_: .word 0xAABBCCDD
In Example 1 the assembler is taking care of how to arrange the word in memory.
If the architecture is little endian the lowest address will hold 0xDD, then 0xCC, then 0xBB and then 0xAA.
If the architecture is big endian it will be the other way round: first 0xAA, then 0xBB, then 0xCC and then 0xDD.
So when you issue lw $t0, word_ you get the value you expect (0xAABBCCDD) in both cases.
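Since you don't have a big-endian machine at hand, this round trip can be checked with a small Python sketch. It is only a model: int.to_bytes plays the role of the assembler laying out the .word, int.from_bytes plays the role of lw, and the function names assemble_word and lw are made up for this illustration.

```python
def assemble_word(value: int, byteorder: str) -> bytes:
    """Lay out a 32-bit .word the way an assembler for that byte order would."""
    return value.to_bytes(4, byteorder)


def lw(memory: bytes, byteorder: str) -> int:
    """Model lw: read the first 4 bytes of memory in the CPU's byte order."""
    return int.from_bytes(memory[:4], byteorder)


for order in ("little", "big"):
    mem = assemble_word(0xAABBCCDD, order)
    # Show the bytes from lowest to highest address, then the loaded value.
    print(order, mem.hex(" "), hex(lw(mem, order)))
```

The byte sequences differ (dd cc bb aa versus aa bb cc dd), but the loaded value is 0xaabbccdd either way, because the same byte order is used for storing and for loading.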
In your second example you are defining an array of bytes, so the assembler must obey your ordering. The lowest address will hold 0xAB, then 0xCD, then 0xEF, then 0xAA.
So when you issue lw $t0,bytearr_ you will get different results depending on whether your architecture is little endian or big endian.
If your architecture is little endian you end up with $t0=0xAAEFCDAB, and if your architecture is big endian you end up with $t0=0xABCDEFAA.
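The same four bytes can be interpreted both ways in Python with the standard struct module, where "<I" means a little-endian unsigned 32-bit integer and ">I" a big-endian one. This is a sketch of the interpretation step only, not of a MIPS machine:

```python
import struct

# .byte 0xAB, 0xCD, 0xEF, 0xAA  (lowest address first)
bytearr = bytes([0xAB, 0xCD, 0xEF, 0xAA])

le_word = struct.unpack("<I", bytearr)[0]  # what lw sees on a little-endian CPU
be_word = struct.unpack(">I", bytearr)[0]  # what lw sees on a big-endian CPU

print(hex(le_word))  # 0xaaefcdab
print(hex(be_word))  # 0xabcdefaa
```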
The third example is similar to the second. You define an array of bytes, so the lowest address will hold 0xAB and then 0xCD, and issuing lhw $t0, b2hw will end up with $t0=0xCDAB if the architecture is little endian and $t0=0xABCD if it is big endian.
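One detail worth keeping in mind for halfword loads: MIPS has a sign-extending form (lh) and a zero-extending form (lhu). The values quoted above are the zero-extended results; since both 0xCDAB and 0xABCD have their top bit set, a sign-extending load would fill the upper 16 bits of the register with ones. The sketch below models both forms with struct; load_half is a hypothetical helper, not a real instruction name.

```python
import struct

b2hw = bytes([0xAB, 0xCD])  # .byte 0xAB, 0xCD  (lowest address first)


def load_half(memory: bytes, order: str, signed: bool) -> int:
    """Return the 32-bit register contents after a halfword load.

    order is "<" for little endian or ">" for big endian.
    signed=True models a sign-extending load, signed=False a zero-extending one.
    """
    code = "h" if signed else "H"
    value = struct.unpack(order + code, memory[:2])[0]
    return value & 0xFFFFFFFF  # view the (possibly negative) value as 32 bits


for name, order in (("LE", "<"), ("BE", ">")):
    zero_ext = load_half(b2hw, order, signed=False)
    sign_ext = load_half(b2hw, order, signed=True)
    print(name, hex(zero_ext), hex(sign_ext))
```

This prints 0xcdab and 0xffffcdab for little endian, and 0xabcd and 0xffffabcd for big endian.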
If you wish to let the assembler manage the arrangement then you would use the directive .half, like so:
lhw $t0,b2hw
b2hw : .half 0xABCD //let the assembler figure out how to arrange this half word in memory
Your final question asks what happens when you "attempt to load a word into a halfword or a halfword into a word".
The answer is that you really don't load a word into a halfword nor a halfword into a word. You load a word or a halfword starting at some address. So if you have the following example:
hw_: .half 0xABCD
w_: .word 0xAABBCCDD
this code:
lw $t0, hw_
will load a word starting at the address pointed to by hw_, and
lhw $t0, w_
will load a half word starting at the address pointed to by w_.
The arrangement in memory would be (from smaller addresses to larger ones):
If it's little endian:
0xCD ; hw_
0xAB
0xDD ; w_
0xCC
0xBB
0xAA
so if you issue lw $t0, hw_ you would get 0xCCDDABCD, and with lhw $t0, w_ you would get 0xCCDD.
And if it is big endian:
0xAB ; hw_
0xCD
0xAA ; w_
0xBB
0xCC
0xDD
so if you issue lw $t0, hw_ you would get 0xABCDAABB, and with lhw $t0, w_ you would get 0xAABB.
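These four results can be reproduced with a short Python model of the layout. It makes two assumptions that should be stated loudly: the .word follows the .half directly with no padding, exactly as in the listings above (a real assembler may insert alignment padding before .word, which would change what lw reads), and the load is allowed at an address that is not a multiple of four (on real MIPS hardware lw requires a word-aligned address). The names layout, load, HW_ and W_ are invented for the sketch.

```python
def layout(byteorder: str) -> bytes:
    """Bytes for hw_: .half 0xABCD followed directly by w_: .word 0xAABBCCDD."""
    return (0xABCD).to_bytes(2, byteorder) + (0xAABBCCDD).to_bytes(4, byteorder)


def load(memory: bytes, offset: int, size: int, byteorder: str) -> int:
    """Read size bytes starting at offset and zero-extend the result."""
    return int.from_bytes(memory[offset:offset + size], byteorder)


HW_, W_ = 0, 2  # label offsets under the no-padding assumption

for order in ("little", "big"):
    mem = layout(order)
    word_at_hw = load(mem, HW_, 4, order)  # models lw at hw_
    half_at_w = load(mem, W_, 2, order)    # models a zero-extending halfword load at w_
    print(order, mem.hex(" "), hex(word_at_hw), hex(half_at_w))
```

The little-endian pass gives 0xccddabcd and 0xccdd, and the big-endian pass gives 0xabcdaabb and 0xaabb, matching the values worked out by hand.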