Should all code compiled for 32-bit machines be in 4-byte chunks?

I wrote a simple piece of 32-bit assembly code:

movl  $0x542412e6, %eax
movl  %ebp , %edx
addl  $0x30, %edx
movl  %edx, %ebp
pushl $0x08048dd6
ret

When I run this command:

gcc -m32 -c e.s

I get the following 18 bytes:

0:  b8 e6 12 24 54          mov    $0x542412e6,%eax
5:  89 ea                   mov    %ebp,%edx
7:  83 c2 30                add    $0x30,%edx
a:  89 d5                   mov    %edx,%ebp
c:  68 d6 8d 04 08          push   $0x8048dd6
11: c3                      ret

Why is the object code 18 bytes and not 20 or 16? Shouldn't it always be in 4-byte words for a 32-bit machine?

Solution

Instruction size is not related to the width of the data bus or the address bus. The 8088, for example, is a 16-bit x86 CPU with three entirely different widths: 16-bit registers, an 8-bit data bus and a 20-bit address bus, on top of variable-length instructions. Modern 32-bit and 64-bit x86 CPUs keep variable-length instructions (anywhere from 1 to 15 bytes) for backward compatibility.
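
To make the variable lengths concrete, here is a small Python sketch that tallies the encodings from the listing in the question. The byte strings are copied from that listing; the names `LISTING`, `instruction_lengths` and `start_offsets` are made up for illustration.

```python
# Byte encodings copied from the disassembly listing in the question.
LISTING = [
    ("mov  $0x542412e6,%eax", "b8 e6 12 24 54"),
    ("mov  %ebp,%edx",        "89 ea"),
    ("add  $0x30,%edx",       "83 c2 30"),
    ("mov  %edx,%ebp",        "89 d5"),
    ("push $0x8048dd6",       "68 d6 8d 04 08"),
    ("ret",                   "c3"),
]


def instruction_lengths(listing):
    """Return the length in bytes of each encoded instruction."""
    return [len(bytes.fromhex(hex_bytes)) for _, hex_bytes in listing]


def start_offsets(lengths):
    """Return the starting offset of each instruction, given the lengths."""
    offsets, position = [], 0
    for length in lengths:
        offsets.append(position)
        position += length
    return offsets


if __name__ == "__main__":
    lengths = instruction_lengths(LISTING)
    offsets = start_offsets(lengths)
    for (text, _), offset, length in zip(LISTING, offsets, lengths):
        print(f"{offset:#04x}  {length} byte(s)  {text}")
    print("total:", sum(lengths), "bytes")
```

The lengths come out as 5, 2, 3, 2, 5 and 1 bytes, which add up to the 18 bytes you observed, and the computed offsets match the left-hand column of the listing.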

Just look at the movl $0x542412e6, %eax and push $0x8048dd6 lines: an opcode plus a 32-bit immediate (and, for movl, the destination register) cannot fit into 32 bits, so each of them takes 5 bytes, one opcode byte followed by four immediate bytes. An architecture with fixed-length 32-bit instructions must instead use multiple instructions or a literal pool to load a 32-bit constant.
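
As a rough illustration of both approaches, the Python sketch below builds the 5-byte x86 encoding of the movl from the question and then models how a fixed-width ISA could assemble the same constant from two 16-bit halves. The two-step model is patterned on a MIPS-style lui/ori pair, which is my choice of example and not something from the question; all function names are hypothetical.

```python
import struct

CONSTANT = 0x542412E6  # the immediate from the question


def x86_mov_eax_imm32(value):
    """Encode `movl $value, %eax`: opcode 0xB8, then a little-endian imm32."""
    return bytes([0xB8]) + struct.pack("<I", value)


def split_halves(value):
    """Split a 32-bit value into (upper 16 bits, lower 16 bits)."""
    return (value >> 16) & 0xFFFF, value & 0xFFFF


def load_in_two_steps(upper, lower):
    """Model a lui/ori-style pair: set the upper half, then OR in the lower."""
    reg = (upper << 16) & 0xFFFFFFFF  # first instruction: upper half, low bits zero
    reg |= lower                      # second instruction: fill in the lower half
    return reg


if __name__ == "__main__":
    print(x86_mov_eax_imm32(CONSTANT).hex(" "))
    upper, lower = split_halves(CONSTANT)
    print(hex(upper), hex(lower), hex(load_in_two_steps(upper, lower)))
```

The x86 encoding comes out as b8 e6 12 24 54, the same 5 bytes as in the listing, while the fixed-width variant needs two instructions that each carry only 16 bits of the constant.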

RISC architectures often have fixed-width instructions, trading code density for decoder simplicity. But 32-bit RISC architectures whose instruction size is not 32 bits also exist. For example, MIPS16e and the original ARM Thumb have 16-bit instructions, whereas ARM Thumb-2 and the Dalvik VM have variable-length instructions. Modern 64-bit RISC architectures likewise do not use 64-bit instructions; they generally stick with the 32-bit size.