assembly, x86, endianness

What happens when 4-byte mov is used to load multiple words?


I have the following code (using AT&T syntax for x86):

x: .word 1
y: .word 0
z: .word 2
...
mov x, %eax # eax = 0x1

How is x moved into %eax?

I know that mov moves 4 bytes because the destination is 4-byte %eax, so an l suffix is optional.

I know x, y, and z are stored next to each other in memory since I put those source lines together in the same section, so I expect it would be something like this:

x       y       z
|       |       |
00 01 | 00 00 | 00 02

And mov will take the first 4 bytes: 00 01 00 00. Since x86 is little-endian, I expected eax to hold those 4 bytes in reverse order, i.e. eax = 0x00000100, which is not the 0x1 I actually get when I try it. (Checked in GDB with print /x $eax after single-stepping past the mov.)

What am I doing wrong?


Solution

  • Also, I know that x, y, and z are stored next to each other in memory, so I expect it would be something like this:

    x       y       z
    |       |       |
    00 01 | 00 00 | 00 02
    

    The order of these bytes is wrong!
    Because x86 is little-endian, the assembler stores the low byte of each word first:

    x       y       z
    |       |       |
    01 00 | 00 00 | 02 00
    

    A full dword load at x therefore reads the bytes 01 00 00 00, which is the value 00000001h.
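The layout above can be checked without an assembler. The following is a minimal Python sketch, not the assembler's actual mechanism: it assumes the three .word values are emitted back to back as 16-bit little-endian quantities, and it uses the standard struct module to model both the memory image and the 4-byte load. The names memory, eax, and assumed are illustrative only.

```python
import struct

# Model the data section: three 16-bit words (x=1, y=0, z=2),
# each packed little-endian (low byte first).
memory = struct.pack("<HHH", 1, 0, 2)
print(" ".join(f"{b:02x}" for b in memory))  # 01 00 00 00 02 00

# Model `mov x, %eax`: read 4 bytes at offset 0 as a little-endian dword.
(eax,) = struct.unpack_from("<I", memory, 0)
print(hex(eax))  # 0x1

# The layout assumed in the question (high byte of each word first)
# would have produced 0x100 for the same 4-byte load.
assumed = struct.pack(">HHH", 1, 0, 2)
(eax_assumed,) = struct.unpack_from("<I", assumed, 0)
print(hex(eax_assumed))  # 0x100
```

The last case reproduces the value the question expected, which suggests the load itself was reasoned about correctly and only the assumed byte order in memory was off.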


    With that order, mov will take the first 4 bytes from right to left? Like dcba? 01 (a) 00 (b) | 00 (c) 00 (d)

    If you like to think about it that way. When it executes a dword-sized mov, the CPU fetches 4 bytes from memory and places the first byte (the one at the lowest address) in the lowest byte of %EAX, continuing upward so that the fourth byte lands in the highest byte.
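That byte-to-register mapping can be written out explicitly. This is a hedged sketch in Python: load_dword_le is a hypothetical helper name, and it only models the arithmetic of a little-endian 4-byte load, not how the hardware performs it.

```python
def load_dword_le(memory: bytes, offset: int = 0) -> int:
    """Model a little-endian 4-byte load: byte i fills bits 8*i .. 8*i+7."""
    value = 0
    for i in range(4):
        value |= memory[offset + i] << (8 * i)
    return value


# Bytes of x, y, z in memory order: 01 00 | 00 00 | 02 00
memory = bytes([0x01, 0x00, 0x00, 0x00, 0x02, 0x00])

# Load at x: the first byte goes to the low byte, the fourth to bits 24-31.
print(f"{load_dword_le(memory):#010x}")     # 0x00000001

# Load at y (offset 2): y fills the low half, z fills the high half.
print(f"{load_dword_le(memory, 2):#010x}")  # 0x00020000
```

The second call shows what a 4-byte load does when it spans two words: the word at the lower address ends up in the low 16 bits of the result.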