I'm wondering if the following
mov eax, [ebx+4]
is equivalent to
add ebx, 4
mov eax, [ebx]
Is there a performance improvement, or any other reason to prefer the first over the second, apart from readability and code style?
The two are not strictly equivalent (the differences are noted below), and there are several reasons to prefer mov eax, [ebx+4] over the add/mov pair:
- The address calculation ebx+4 is performed "for free" by the address calculation logic in the CPU
- It is only one instruction long, vs two
- It does not cause a dependency stall (depending on the CPU implementation) between the results of an ALU operation and a memory load
- It preserves the FLAGS register (condition codes): MOV does not alter EFLAGS, whereas ADD does
- It takes fewer bytes of code: typically 3 bytes versus 5 for the add/mov pair in 32-bit mode
- It does not modify EBX. Depending on the context that may or may not be what you want: if later code needs EBX advanced by 4, the ADD is still required
- It frees up micro-architectural resources so that other instructions can execute in parallel with the MOV. Modern CPUs execute the MOV (a load from memory) in a load/store execution unit that can perform its own address calculation, and the ADD in an integer (fixed-point) execution unit. Doing the address calculation in the MOV leaves that integer unit free to do something else in parallel.
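To make the flag, size, and register differences concrete, here is a small annotated sketch in NASM-style Intel syntax for 32-bit mode. The labels form1, form2, and equal are hypothetical names chosen for illustration, the preceding cmp is an assumed context, and the byte encodings in the comments are the ones an assembler would typically emit (older assemblers may pick a longer form for the add).

```nasm
bits 32
section .text

; Form 1: displacement folded into the load.
form1:
        cmp  ecx, edx          ; sets EFLAGS
        mov  eax, [ebx+4]      ; typically 8B 43 04 (3 bytes); EBX and EFLAGS untouched
        je   equal             ; still tests the CMP result
        ret

; Form 2: separate ADD, then load.
form2:
        cmp  ecx, edx          ; sets EFLAGS
        add  ebx, 4            ; typically 83 C3 04 (3 bytes); changes EBX, overwrites EFLAGS
        mov  eax, [ebx]        ; typically 8B 03 (2 bytes); needs the ADD result first
        je   equal             ; now tests the ADD result, not the CMP
        ret

equal:
        ret
```

Both forms load the same dword into EAX, but only form 2 leaves EBX advanced by 4 and the flags changed. If the code is walking through memory and actually wants EBX updated, form 2 (or a load followed by the add) is still a reasonable choice.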