Tags: gcc, x86, cpu, cpu-architecture

How do I force the CPU to execute a program in order, when the program has no loops or branches?


Is it possible, at least for a small piece of code without any branches or loops? Are there any gcc flags or intrinsics (like the SSE ones) for x86 and other processor families? I am just curious, since all the processors available these days follow an out-of-order execution model.


Solution

  • Most modern high-performance CPUs are inherently out-of-order, and they offer no way to switch between in-order and out-of-order modes.

    You can try to find some in-order CPU, and there are some:

    • x86: Intel Atom before Silvermont (for example the 45 nm Bonnell cores; they have two parallel pipelines but execute all instructions in order)
    • ARM: Cortex-A8, and many older cores

    While it is not possible to directly turn off instruction reordering in a typical out-of-order CPU, you can insert a serializing instruction (such as cpuid on x86) between every pair of instructions to simulate in-order execution.
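As a rough illustration of that idea, here is a minimal sketch in C using GCC-style inline assembly. The names `serialize` and `straight_line` are made up for this example, and the non-x86 fallback is only a compiler barrier (an assumption so the sketch still builds elsewhere; it does not serialize the CPU there). The intermediate value is kept in a `volatile` variable so that the compiler itself does not hoist the arithmetic across the barriers.

```c
#include <stdint.h>

/* Hypothetical helper: executes CPUID, a serializing instruction on x86.
   CPUID overwrites EAX, EBX, ECX and EDX, so all four are declared as
   outputs. On non-x86 targets this is only a compiler barrier. */
static inline void serialize(void)
{
#if defined(__x86_64__) || defined(__i386__)
    uint32_t eax = 0, ebx, ecx = 0, edx;
    __asm__ __volatile__("cpuid"
                         : "+a"(eax), "=b"(ebx), "+c"(ecx), "=d"(edx)
                         :
                         : "memory");
    (void)ebx;
    (void)edx;
#else
    __asm__ __volatile__("" ::: "memory");
#endif
}

/* Straight-line code (no branches or loops) with a serializing
   instruction between each step. */
uint32_t straight_line(uint32_t x)
{
    volatile uint32_t v = x;  /* volatile: keeps each step's load/store in place */

    v = v + 1;
    serialize();
    v = v * 3;
    serialize();
    v = v ^ 0x55u;
    serialize();

    return v;
}
```

Expect this to be slow: each CPUID typically costs on the order of a hundred cycles or more on bare metal, and can be far more expensive inside a virtual machine, where it usually traps to the hypervisor.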

    The Intel manuals (Volume 3A) have a section on serializing instructions (copied from http://objectmix.com/asm-x86-asm-370/69413-serializing-instructions.html):

    Volume 3A: System Programming Guide states

    7.4 SERIALIZING INSTRUCTIONS

    The Intel 64 and IA-32 architectures define several serializing instructions. These instructions force the processor to complete all modifications to flags, registers, and memory by previous instructions and to drain all buffered writes to memory before the next instruction is fetched and executed. For example, when a MOV to control register instruction is used to load a new value into control register CR0 to enable protected mode, the processor must perform a serializing operation before it enters protected mode. This serializing operation insures that all operations that were started while the processor was in real-address mode are completed before the switch to protected mode is made.

    The concept of serializing instructions was introduced into the IA-32 architecture with the Pentium processor to support parallel instruction execution. Serializing instructions have no meaning for the Intel486 and earlier processors that do not implement parallel instruction execution.

    It is important to note that executing of serializing instructions on P6 and more recent processor families constrain speculative execution because the results of speculatively executed instructions are discarded. The following instructions are serializing instructions:

    • Privileged serializing instructions - MOV (to control register, with the exception of MOV CR8), MOV (to debug register), WRMSR, INVD, INVLPG, WBINVD, LGDT, LLDT, LIDT, and LTR.

    • Non-privileged serializing instructions - CPUID, IRET, and RSM.

    When the processor serializes instruction execution, it ensures that all pending memory transactions are completed (including writes stored in its store buffer) before it executes the next instruction. Nothing can pass a serializing instruction and a serializing instruction cannot pass any other instruction (read, write, instruction fetch, or I/O). For example, CPUID can be executed at any privilege level to serialize instruction execution with no effect on program flow, except that the EAX, EBX, ECX, and EDX registers are modified.
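To make the register side effect concrete, the sketch below reads CPUID leaf 0 through `__get_cpuid` from the `<cpuid.h>` header shipped with GCC and Clang, and assembles the vendor string from the overwritten EBX, EDX and ECX registers. The function name `cpuid_vendor` is made up for this example, and on non-x86 targets it simply reports failure.

```c
#include <string.h>
#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang header providing __get_cpuid() */
#endif

/* Hypothetical helper: writes the 12-character CPU vendor string
   (for example "GenuineIntel") plus a terminating NUL into out.
   Returns 1 on success, 0 if CPUID is unavailable or the target
   is not x86. */
int cpuid_vendor(char out[13])
{
    out[0] = '\0';
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax = 0, ebx = 0, ecx = 0, edx = 0;

    /* Leaf 0: CPUID overwrites all four registers. */
    if (!__get_cpuid(0, &eax, &ebx, &ecx, &edx))
        return 0;

    /* The vendor string is stored in EBX, EDX, ECX, in that order. */
    memcpy(out + 0, &ebx, 4);
    memcpy(out + 4, &edx, 4);
    memcpy(out + 8, &ecx, 4);
    out[12] = '\0';
    return 1;
#else
    return 0;
#endif
}
```

The same point applies when CPUID is used purely as a barrier: any hand-written inline assembly has to tell the compiler that EAX, EBX, ECX and EDX are clobbered, otherwise live values in those registers can be silently corrupted.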