parallel-processing pipeline cpu-architecture vliw

Instruction Level Parallelism (ILP) Methods

I'm trying to learn about the methods used to achieve instruction-level parallelism and the differences between them. My question is: given an instruction set that was originally designed for a processor without instruction-level parallelism, which of these methods can be used to achieve instruction-level parallelism on a new processor, and why/how? The new processor will execute the same instruction set and run the same program binaries as the original one, unchanged, but with better performance. The options are:

1) Out-of-order execution (Tomasulo algorithm)

2) Pipelining

3) Superscalar

4) VLIW

Solution

I would say OOO is the first thing that will greatly increase ILP. OOO execution is a hardware technique that is totally independent of the compiler: an OOO processor carries out the same computations as a processor without OOO and produces the same results in less time, with no change to the instruction set or the binaries at all.
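To illustrate why the same binary can finish sooner, here is a minimal timing sketch. It is not Tomasulo's algorithm itself: it assumes a single-issue machine, tracks only true (read-after-write) dependencies as register renaming would, and the names `Instr`, `producers`, `run` and the latencies are made up for the example.

```python
from collections import namedtuple

# Hypothetical instruction record: destination register, source registers, latency in cycles.
Instr = namedtuple("Instr", "dest srcs latency")

def producers(program):
    """For each instruction, list the indices of the earlier instructions that produce its sources."""
    last_writer = {}
    deps = []
    for i, ins in enumerate(program):
        deps.append([last_writer[r] for r in ins.srcs if r in last_writer])
        last_writer[ins.dest] = i
    return deps

def run(program, out_of_order):
    """Return the cycle at which the last result is available (toy model, one issue per cycle)."""
    deps = producers(program)
    done_at = {}                      # instruction index -> cycle its result becomes available
    pending = list(range(len(program)))
    cycle = 0
    while pending:
        # In-order issue may only consider the oldest pending instruction;
        # out-of-order issue may pick the oldest *ready* one from the whole window.
        window = pending if out_of_order else pending[:1]
        for i in window:
            if all(d in done_at and done_at[d] <= cycle for d in deps[i]):
                done_at[i] = cycle + program[i].latency
                pending.remove(i)
                break                 # at most one instruction issues per cycle
        cycle += 1
    return max(done_at.values(), default=0)

# Same instruction sequence for both machines: two independent load/add pairs.
demo = [
    Instr("r1", (), 3),       # load r1
    Instr("r2", ("r1",), 1),  # add  r2 = r1 + 1
    Instr("r3", (), 3),       # load r3
    Instr("r4", ("r3",), 1),  # add  r4 = r3 + 1
]
print(run(demo, out_of_order=False), run(demo, out_of_order=True))  # prints "8 5"
```

Both runs respect the same dependencies, so the computed values would be identical; only the issue order, and therefore the cycle count, differs.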

Pipelining is a well-known and old technique to increase ILP, but it has its limitations: adding stages increases hardware complexity and eventually gives diminishing returns.
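The diminishing returns can be seen in a simple textbook-style model. This is only a sketch under stated assumptions: the logic is split evenly across stages, every stage adds a fixed latch overhead, hazards and stalls are ignored, and the function name and default numbers are hypothetical.

```python
def pipeline_speedup(stages, n_instructions, logic_delay=10.0, latch_overhead=1.0):
    """Speedup of a `stages`-deep pipeline over an unpipelined design (no hazards modeled)."""
    # Unpipelined: every instruction takes the full logic delay.
    unpipelined = n_instructions * logic_delay
    # Pipelined: the cycle time is one slice of the logic plus the per-stage latch overhead.
    cycle_time = logic_delay / stages + latch_overhead
    # The first instruction needs `stages` cycles, each following one completes a cycle later.
    pipelined = (stages + n_instructions - 1) * cycle_time
    return unpipelined / pipelined

for k in (5, 10, 20, 40):
    print(k, round(pipeline_speedup(k, 1000), 2))
```

In this model the speedup keeps growing with the stage count but can never exceed `logic_delay / latch_overhead`, so each doubling of the depth buys less than the previous one.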

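VLIW and superscalar both issue more than one instruction per cycle, but they are different styles of parallelism. A superscalar processor finds independent instructions in hardware at run time, so it needs extra hardware but no special compiler, and it can run the existing binaries unchanged. VLIW instead relies on the compiler to pack more than one instruction into one Very Long Instruction Word (VLIW) that can be executed in parallel. That requires a special compiler and a different instruction encoding, so VLIW is not compatible with binaries built for the original processor.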
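The compiler-side packing can be sketched as a greedy bundler. This is an illustration only, not how any particular VLIW compiler works: it assumes unit latency, that every register is written at most once (so only read-after-write dependencies matter), and the function name `pack_bundles` is hypothetical.

```python
def pack_bundles(program, width):
    """Pack instructions into bundles of at most `width` independent operations.

    `program` is a list of (dest, srcs) tuples in program order.
    Returns a list of bundles, each a list of instruction indices.
    """
    # Find, for each instruction, the earlier instructions that produce its sources.
    last_writer = {}
    deps = []
    for i, (dest, srcs) in enumerate(program):
        deps.append({last_writer[r] for r in srcs if r in last_writer})
        last_writer[dest] = i

    bundle_of = {}   # instruction index -> index of the bundle it was placed in
    bundles = []
    for i in range(len(program)):
        # An instruction must come after every bundle that holds one of its producers.
        b = max((bundle_of[d] + 1 for d in deps[i]), default=0)
        # Skip bundles that are already full.
        while b < len(bundles) and len(bundles[b]) >= width:
            b += 1
        if b == len(bundles):
            bundles.append([])
        bundles[b].append(i)
        bundle_of[i] = b
    return bundles

example = [
    ("r1", []),            # 0
    ("r2", ["r1"]),        # 1: needs 0
    ("r3", []),            # 2
    ("r4", ["r3"]),        # 3: needs 2
    ("r5", ["r2", "r4"]),  # 4: needs 1 and 3
]
print(pack_bundles(example, width=2))  # prints [[0, 2], [1, 3], [4]]
```

Because the grouping is decided at compile time and encoded in the instruction words, a program built for the original processor would have to be recompiled to take this form.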