I just started learning ARM assembly. I am currently on 32-bit Raspbian with "GNU assembler version 2.35.2 (arm-linux-gnueabihf)".
This is my simple program to load part of an ASCII string into a register:
.global _start
_start:
ldr r1,=helloworld
ldr r2,[r1]
@prepare to exit
mov r0,#0
mov r7,#1
svc 0
.data
helloworld:
.ascii "HelloWorld"
I loaded it into gdb and can see that register r2 is loaded with 0x6c6c6548 (in ASCII, "lleH"). A quick objdump shows:
Contents of section .data:
0000 48656c6c 6f576f72 6c64 HelloWorld
I have the following questions:
Why is the value 0x12345678 instead of 0x78563412 when .word is used? Why is endianness not followed? Note: .word is used instead of .ascii in the program below.
.global _start
_start:
ldr r1,=helloworld
ldr r2,[r1]
mov r0,#0
mov r7,#1
svc 0
.data
helloworld:
.word 0x12345678
EDIT
The memory dump for the first program shows that memory also holds the string in the same order as the source code and the object file:
>>> x/32xb 0x1008c
0x1008c: 0x48 0x65 0x6c 0x6c 0x6f 0x57 0x6f 0x72
0x10094: 0x6c 0x64 0x41 0x11 0x00 0x00 0x00 0x61
This indicates that the ldr instruction converts that memory read into little-endian format, where the LSB holds the first byte in memory. Is my understanding correct? But this still does not answer why this did not happen for a .word.
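Endianness, or byte order, is the order in which the bytes comprising a number are represented in memory.
A string is an array of bytes. Each byte of this string is subject to endianness, but for a single byte, little and big endian come out to the same thing. That is why the string appears in memory in source order. When ldr then reads four of those bytes as one 32-bit number on a little-endian system, the byte at the lowest address (0x48, 'H') becomes the least significant byte, which gives 0x6c6c6548.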
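To make the byte-array view concrete, here is a minimal C sketch of what a little-endian 32-bit load does with four consecutive bytes. The helper name load_le32 is hypothetical, and the function only models the byte arithmetic of ldr; it is not how the hardware is implemented.

```c
#include <stdint.h>

/* Hypothetical helper: models a little-endian 32-bit load such as
   `ldr r2,[r1]`. The byte at the lowest address ends up as the
   least significant byte of the result. */
uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0]            /* lowest address  -> bits 0..7   */
         | ((uint32_t)p[1] << 8)     /*                 -> bits 8..15  */
         | ((uint32_t)p[2] << 16)    /*                 -> bits 16..23 */
         | ((uint32_t)p[3] << 24);   /* highest address -> bits 24..31 */
}
```

Called on the first four bytes of "HelloWorld" (0x48 0x65 0x6c 0x6c), this returns 0x6c6c6548, the same value gdb shows in r2.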
For your second question: endianness only affects how data is laid out in memory. The assembler gives you a human-readable representation of the computer program. The token 0x12345678 represents a certain number. When transferred to memory, this number is written in the appropriate byte order. The assembler takes care of this.
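As a rough illustration of what the assembler does for .word on a little-endian target, the following C sketch splits a 32-bit number into the four bytes that would be placed in memory. The name store_le32 is hypothetical and is not part of any assembler or library.

```c
#include <stdint.h>

/* Hypothetical helper: writes a 32-bit number to memory in
   little-endian order, least significant byte at the lowest address. */
void store_le32(uint32_t value, uint8_t out[4])
{
    out[0] = (uint8_t)(value);        /* bits 0..7   */
    out[1] = (uint8_t)(value >> 8);   /* bits 8..15  */
    out[2] = (uint8_t)(value >> 16);  /* bits 16..23 */
    out[3] = (uint8_t)(value >> 24);  /* bits 24..31 */
}
```

For the value 0x12345678 the bytes come out as 0x78 0x56 0x34 0x12, so the number itself never changes; only its layout in memory is reversed relative to how it is written in the source.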
You will also see the register content as 0x12345678 when watching the execution of your program in a debugger. This is because registers are not part of memory and are not divided into bytes. Each register holds a 32-bit number. The CPU transfers data between registers and memory in the configured byte order (see the SETEND instruction). Without the register being divided into bytes, there is no meaningful way to assign a byte order to it. The debugger can only show you its numeric value, and this just comes out to be the value you assigned to it in your program. Crazy how this works, eh?
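The round trip can be sketched in C as well. This is only a model under the assumption that the store and the load use the same byte order; the function name roundtrip and its big_endian flag are made up for the illustration.

```c
#include <stdint.h>

/* Hypothetical model: store a number to 4 bytes of "memory" in the
   chosen byte order, then load it back in the same order. */
uint32_t roundtrip(uint32_t value, int big_endian)
{
    uint8_t mem[4];
    uint32_t reg = 0;
    int i;

    /* Store: mem[0] is the lowest address. */
    for (i = 0; i < 4; i++) {
        int shift = big_endian ? 8 * (3 - i) : 8 * i;
        mem[i] = (uint8_t)(value >> shift);
    }

    /* Load: reassemble the number using the same byte order. */
    for (i = 0; i < 4; i++) {
        int shift = big_endian ? 8 * (3 - i) : 8 * i;
        reg |= (uint32_t)mem[i] << shift;
    }
    return reg;
}
```

Both roundtrip(0x12345678, 0) and roundtrip(0x12345678, 1) return 0x12345678. The intermediate bytes differ between the two orders, but the number that reaches the register is the same, which is what the debugger displays.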