I'm new to MIPS and I'm just wondering, I store a space character in the following ways:
li $t0, ' '
lb $t1, ' '
la $t2, myArray # load array
sb $t0, 0($t2) # myArray[0] = ' '
In this case, is $t0 == $t1? And is the sb instruction valid? What I'm a bit confused about is whether I can use bytes and ints (words) interchangeably.
A byte and a word are not freely interchangeable, because a byte holds only 8 bits of information, while a word holds 32 bits (on the MIPS platform). So a byte can be set to 256 different values (2^8 combinations of eight 0/1 bits), and a word can be set to 256^4 = 2^32 = 4,294,967,296 different values (patterns of 32 bits).
You need four bytes to store the same amount of information that fits into a single word (8 bits * 4 = 32 bits).
But depending on the values you are processing, if you can guarantee their ranges, you can predict how code converting values between byte/half-word/word will behave: whether the values survive such conversions undamaged, or whether extra validation/handling is needed. For example, if your input values are ASCII characters (from a string), those are only 7 bits wide (ASCII defines only the values 0 to +127, so they read the same whether the byte is interpreted as signed or unsigned).
So for example li $t0, ' ' will assemble as li $t0, 32 (because the "space" character is encoded as the value 32), since the li instruction takes a signed integer immediate as its operand.
Actually, li is not a real MIPS instruction but a convenience pseudo-instruction; the assembler converts it into one or two native instructions that compose the desired immediate value. Try, for example, li $t0, ... with the values +1, -1, +65000, and -65000 and watch in the debugger how it gets assembled into different native instructions that achieve the desired "load immediate" effect; for example, the value -65000 needs at least two native instructions to be composed.
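As a rough illustration of that experiment, here is a hand-written sketch of native instructions that produce the same register contents as the four li variants. This is an assumption about one plausible expansion, not a dump of what any particular assembler emits; MARS, for instance, may route the two-instruction case through $at instead of the target register.

```mips
.text
    # li $t0, 1       -> one instruction: the value fits a signed 16-bit immediate
    addiu $t0, $zero, 1
    # li $t1, -1      -> still one instruction: addiu sign-extends its immediate
    addiu $t1, $zero, -1
    # li $t2, 65000   -> one instruction: ori zero-extends its 16-bit immediate
    ori   $t2, $zero, 65000
    # li $t3, -65000  -> 0xFFFF0218 does not fit 16 bits either way,
    #                    so the upper and lower halves are loaded separately
    lui   $t3, 0xFFFF          # $t3 = 0xFFFF0000
    ori   $t3, $t3, 0x0218     # $t3 = 0xFFFF0218 = -65000

    li    $v0, 10              # exit
    syscall
```

Comparing this with the debugger view of the li versions should show the same final values in the registers, even if the chosen native instructions differ.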
So technically you are loading a 32-bit (word) value into $t0 (even though ' ' is only the value 32, which easily fits into a byte).
But since you know that you loaded the ASCII "space" into $t0, it doesn't matter that $t0 is 32 bits "wide": you know it is enough to store only a byte into memory, for example if you are creating a new string in a buffer and want to put a space character into it. So sb $t0, 0($t2) is correct. If you had some larger value in $t0, the upper 24 bits would be ignored and only the low 8 bits of that value would be written into memory by the sb instruction (effectively "truncating" the value in memory; it's not possible to read the full value back from memory, only the truncated part).
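A minimal sketch of that truncation, assuming a MARS-style assembler; the label buffer and the value 0x12345678 are made up for illustration:

```mips
.data
buffer: .space 4               # hypothetical 4-byte scratch buffer

.text
    li   $t0, 0x12345678       # full 32-bit value in the register
    la   $t2, buffer
    sb   $t0, 0($t2)           # only the low 8 bits (0x78) reach memory
    lbu  $t1, 0($t2)           # read the byte back, zero-extended: $t1 = 0x78
                               # the upper 24 bits (0x123456) were never stored

    li   $v0, 10               # exit
    syscall
```

After the lbu, $t0 still holds 0x12345678 while $t1 holds only 0x78, which is the part that survived the trip through memory.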
The conversion in the other direction happens often in MIPS assembly too, because for example lb reads only 8 bits from memory but sign-extends them into the full register (32 bits). If you don't pay attention to your values, you can easily trap yourself, for example like this:
.data
test_value: .byte 234
.text
li $t0, 234
lb $t1, test_value
tne $t0, $t1 # throw exception if t0 is not equal to t1
# terminate normally when values are equal
li $v0, 10
syscall
At first read, this may look as if the value 234 is compared against 234, so the program terminates normally, but if you try to run it, it will instead end with an exception at the tne instruction. That is because lb sign-extends the value, and 234 fits in 8 bits only when you interpret the bit pattern as an "unsigned 8-bit integer"; if you interpret the same bit pattern as a "signed 8-bit integer", it becomes the value -22. And -22 does not equal 234.
If you change the lb instruction to lbu, which loads an "unsigned byte", the code will work and exit normally, as tne will then compare 234 against 234, which are equal.
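To see both interpretations of the same byte side by side, something like the following sketch can be stepped through in the debugger. It reuses test_value from the example above; the andi line is an extra assumption showing that masking a sign-extended result is another way to recover the unsigned value.

```mips
.data
test_value: .byte 234          # bit pattern 0xEA

.text
    lb   $t1, test_value       # sign-extended: $t1 = 0xFFFFFFEA = -22
    lbu  $t2, test_value       # zero-extended: $t2 = 0x000000EA = 234
    andi $t3, $t1, 0xFF        # mask off the sign-extension bits: $t3 = 234

    li   $t0, 234
    tne  $t0, $t2              # no trap: 234 == 234
    tne  $t0, $t3              # no trap: 234 == 234

    li   $v0, 10               # exit
    syscall
```

Swapping $t2 for $t1 in the first tne would bring back the exception from the original example.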
So while programming in assembly, you should definitely be aware of the type of data you process, and correctly extend/truncate those values as needed.
(BTW, the MARS assembler will warn you about "234" not fitting a "signed byte" and about possible truncation, but values up to 255 actually do fit in 8 bits; they just need to be interpreted in the "unsigned" way. Values above 255 will get truly truncated, i.e. some bits go completely missing; for example, .byte 1025 will store only the value 1 in memory.)