How are x86 instructions stored in memory?

How many bytes does a MOV instruction take up in memory? If a 4 byte address is used, with 1 byte for the move command itself, how are all 40 bits stored in memory?

From this website, https://www.cs.uaf.edu/2016/fall/cs301/lecture/09_28_machinecode.html

   0:   b8 05 00 00 00          mov    eax,0x5
   5:   b9 05 00 00 00          mov    ecx,0x5
   a:   ba 05 00 00 00          mov    edx,0x5

I was under the impression that memory was broken up into contiguous 2 or 4 byte address locations, and all instructions were stored in memory, so how can a 5 byte instruction fit?

Solution

The x86 instruction set dates back to the 8086, a 16-bit processor whose design descends from Intel's earlier 8-bit chips (and whose 8088 variant had an 8-bit external data bus), with instruction encodings that vary in length. The processor fetches instruction bytes and begins to decode them, fetching more bytes as needed if the instruction isn't finished within the bytes already fetched. Nothing to it really, just a byte stream in memory.

Later processors extended both the instruction set and the width of the memory bus, so they can fetch more than one byte at a time. However, they still need to be able to fetch additional instruction bytes/words if the instruction being decoded isn't complete in the bytes already fetched. They still need to be able to fetch, decode, and execute an instruction of not necessarily known length from an arbitrary byte address.

So, instructions on x86 are simply a byte stream: individual instructions can vary in length from 1 byte up to an architectural maximum of 15 bytes.
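To make the "fetch a byte, decide how many more you need" idea concrete, here is a minimal sketch of a length decoder in Python. It assumes 32-bit mode and covers only a tiny, hand-picked subset of opcodes (`nop`, `ret`, `mov r8, imm8`, `mov r32, imm32`); the function names are made up for illustration, and a real decoder also has to deal with prefixes, ModRM/SIB bytes, displacements and so on.

```python
# Toy length decoder for a tiny subset of 32-bit x86 opcodes.
# Assumption: no prefixes, no ModRM/SIB bytes, only the opcodes listed below.

def instruction_length(code: bytes, offset: int) -> int:
    """Return the length in bytes of the instruction starting at offset."""
    opcode = code[offset]
    if opcode in (0x90, 0xC3):      # nop, ret: opcode byte only
        return 1
    if 0xB0 <= opcode <= 0xB7:      # mov r8, imm8: opcode + 1-byte immediate
        return 2
    if 0xB8 <= opcode <= 0xBF:      # mov r32, imm32: opcode + 4-byte immediate
        return 5
    raise ValueError(f"opcode {opcode:#04x} is outside this toy subset")

def split_instructions(code: bytes) -> list[tuple[int, bytes]]:
    """Walk the byte stream and return (start offset, instruction bytes) pairs."""
    result = []
    offset = 0
    while offset < len(code):
        length = instruction_length(code, offset)
        result.append((offset, code[offset:offset + length]))
        offset += length            # the next instruction starts right after
    return result

# nop ; mov al,1 ; mov eax,5 ; ret
stream = bytes.fromhex("90 b0 01 b8 05 00 00 00 c3")
for start, insn in split_instructions(stream):
    print(f"{start:2x}: {insn.hex(' ')}")
```

The only way to find where the next instruction starts is to decode the current one, which is exactly why the hardware has to treat the code as a byte stream rather than as fixed-size slots.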

so how can a 5 byte instruction fit?

The instruction bytes are stored in memory quite normally, as a byte stream. So, a 5-byte instruction takes 5 consecutive bytes in memory. There is nothing really to fit into except some sequence of bytes.
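As a rough illustration, the listing from the question can be rebuilt by appending each instruction's bytes to a flat byte array. This is only a sketch: `encode_mov_imm32` is a helper made up for this example, and it handles just the `mov r32, imm32` form (opcode `0xB8` plus the register number, followed by a 4-byte little-endian immediate).

```python
import struct

# Register numbers as used by the one-byte "B8+r" form of mov r32, imm32.
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def encode_mov_imm32(reg: str, value: int) -> bytes:
    """Encode `mov reg, imm32`: 1 opcode byte + 4 immediate bytes."""
    return bytes([0xB8 + REG32.index(reg)]) + struct.pack("<I", value)

memory = bytearray()    # stand-in for byte-addressable memory
listing = []
for reg in ("eax", "ecx", "edx"):
    address = len(memory)           # next free byte, no padding or rounding
    insn = encode_mov_imm32(reg, 5)
    memory += insn
    listing.append((address, insn.hex(" ")))

for address, text in listing:
    print(f"{address:4x}:   {text}")
```

The addresses come out as 0, 5 and a, matching the disassembly in the question: each instruction simply starts at the byte after the previous one ends.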

I was under the impression that memory was broken up into contiguous 2 or 4 byte address locations

Not sure where you got this notion, but it doesn't apply to the instruction set on x86, as instructions can be any length from 1 to 15 bytes.

Yes, there is a notion of word size (e.g. 2-, 4-, 8-byte items) and alignment, and on x86 alignment can have a small effect on performance but mostly doesn't matter. Some processors forgo the extra hardware it takes to handle unaligned word data, but x86 has always supported it. This is partly a matter of heritage: the 8088 worked through an 8-bit memory bus, a single byte at a time, so word boundaries were somewhat meaningless as far as the processor's interaction with memory was concerned, whether it was handling word data or an instruction stream made of instructions of varied lengths.
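One place where unaligned data shows up immediately is inside the instructions themselves: the 4-byte immediate of `b8 05 00 00 00` starts at offset 1, right after the opcode byte. The sketch below only models memory as a Python `bytes` object (so it says nothing about hardware timing), and `read_u32` is a name invented for this example.

```python
import struct

# Two of the instructions from the question, back to back.
code = bytes.fromhex("b8 05 00 00 00 b9 05 00 00 00")

def read_u32(buf: bytes, address: int) -> int:
    """Read a 4-byte little-endian value starting at any byte address."""
    return struct.unpack_from("<I", buf, address)[0]

# The immediates start at offsets 1 and 6, neither a multiple of 4.
for address in (1, 6):
    aligned = address % 4 == 0
    print(f"offset {address}: aligned={aligned}, value={read_u32(code, address)}")
```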

Other instruction sets, like MIPS, choose a fixed-length instruction format: on MIPS, all instructions are 4 bytes long. Further, all instructions must be aligned on 4-byte boundaries, so each instruction occupies one aligned memory word. This does simplify the hardware, so decoding on MIPS is easier, but the lack of variable-length instructions does hamper instruction set longevity.
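For contrast, here is a small sketch of what fetching looks like with fixed-length instructions. It assumes a big-endian MIPS32 memory image (MIPS can also be configured little-endian), and `fetch` is a hypothetical helper; the three words are `addiu $v0,$zero,5`, `jr $ra` and `nop`.

```python
import struct

# addiu $v0,$zero,5 ; jr $ra ; nop  (assumed big-endian byte order)
program = bytes.fromhex("24020005 03e00008 00000000")

def fetch(memory: bytes, pc: int) -> int:
    """Fetch one 32-bit instruction word; pc must be 4-byte aligned."""
    if pc % 4 != 0:
        raise ValueError(f"misaligned instruction address {pc:#x}")
    return struct.unpack_from(">I", memory, pc)[0]

# The address of instruction i is simply 4 * i; no decoding is needed
# to find where the next instruction starts.
words = [(pc, fetch(program, pc)) for pc in range(0, len(program), 4)]
for pc, word in words:
    print(f"{pc:4x}:   {word:08x}")
```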

RISC-V, by contrast, sits in between: with the compressed (C) extension, instructions are either 2 or 4 bytes long (the encoding scheme also leaves room for 6- and 8-byte instructions, but AFAIK no such larger instructions have been defined or implemented). These instructions must be aligned on 2-byte boundaries, so they are stored in units of 2 bytes, and a 4-byte instruction is allowed to start on a half-word boundary that is not a word boundary (i.e. when preceded by a 2-byte instruction).
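The RISC-V length rule can be sketched from the low bits of the first 16-bit parcel: if the two lowest bits are not `11`, the instruction is 2 bytes; otherwise, if bits 4:2 are not `111`, it is 4 bytes. The code below is a minimal illustration with made-up function names; it assumes a little-endian instruction stream and does not handle the longer encodings.

```python
import struct

def rv_length(parcel: int) -> int:
    """Instruction length in bytes, from the first 16-bit parcel."""
    if (parcel & 0b11) != 0b11:
        return 2                    # compressed (16-bit) instruction
    if (parcel & 0b11100) != 0b11100:
        return 4                    # standard 32-bit instruction
    raise ValueError("longer encodings are not handled in this sketch")

def rv_offsets(code: bytes) -> list[tuple[int, int]]:
    """Return (offset, length) for each instruction in the stream."""
    result = []
    offset = 0
    while offset < len(code):
        parcel = struct.unpack_from("<H", code, offset)[0]
        length = rv_length(parcel)
        result.append((offset, length))
        offset += length
    return result

# c.nop (0x0001) ; addi x0,x0,0 (0x00000013) ; c.jr ra (0x8082)
stream = bytes.fromhex("0100 13000000 8280")
offsets = rv_offsets(stream)
print(offsets)
```

Here the 4-byte `addi` starts at offset 2, which is 2-byte aligned but not 4-byte aligned, because it follows a 2-byte instruction.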