Tags: multithreading, parallel-processing, computer-science, cpu-architecture

Hardware Multithreading and Simultaneous Multithreading (SMT)


I'm reading Multithreading (computer architecture) - Wiki, aka hardware threading, and I'm trying to understand the second paragraph:

(p2): Where multiprocessing systems include multiple complete processing units in one or more cores, multithreading aims to increase utilization of a single core by using thread-level parallelism, as well as instruction-level parallelism.

while the link to thread-level parallelism says:

(Link): Thread-level parallelism (TLP) is the parallelism inherent in an application that runs multiple threads at once. This type of parallelism is found largely in applications written for commercial servers such as ...

which is not very helpful... So I read the task parallelism section above it, since I guess TLP is a subtype of it:

Task parallelism (also known as function parallelism and control parallelism) is a form of parallelization of computer code across multiple processors in parallel computing environments. Task parallelism focuses on distributing tasks—concurrently performed by processes or threads—across different processors.

Question: If thread-level parallelism is task parallelism, and task parallelism is parallelization across multiple processors, how does "increasing utilization of a single core by using thread-level parallelism" work?

Guessing: I guess that for TLP it should mean across multiple logical processors, i.e., hardware threads from the perspective of the OS. Is that correct?


Another minor issue: my first link, Multithreading, says:

In computer architecture, multithreading is the ability of a central processing unit (CPU) (or a single core in a multi-core processor) to execute multiple processes or threads concurrently, supported by the operating system.

And yet (p2) says multithreading aims to increase utilization of a single core by using thread-level parallelism? That looks like a contradiction.


Solution

  • I don't think we should base this on Wikipedia definitions; the wording there is not precise enough to merit hunting for contradictions.

    First, I would describe task parallelism as a form of parallelism inherent to some algorithm or problem, where there is a functional decomposition into multiple tasks of a different nature that can run concurrently. Alternative forms of parallelism include, for example, spatial or data decomposition, where the problem can be broken into different parts of the data or the input layout (e.g., array ranges, matrix tiles, image parts...).
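    To make the contrast concrete, here is a minimal C++ sketch (my own illustration, not from any of the quoted articles) of the two decompositions: two functionally different tasks running on two threads, versus the same reduction split over disjoint halves of an array. compress_audio and render_video are hypothetical placeholders.

    ```cpp
    #include <numeric>
    #include <thread>
    #include <vector>

    void compress_audio() { /* one kind of work */ }
    void render_video()   { /* a functionally different kind of work */ }

    // Task (functional) parallelism: different tasks run concurrently.
    void task_parallel() {
        std::thread a(compress_audio);
        std::thread b(render_video);
        a.join();
        b.join();
    }

    // Data decomposition: the same operation applied to disjoint array ranges.
    long data_parallel(const std::vector<int>& v) {
        auto mid = v.begin() + v.size() / 2;
        long lo = 0, hi = 0;
        std::thread a([&] { lo = std::accumulate(v.begin(), mid, 0L); });
        std::thread b([&] { hi = std::accumulate(mid, v.end(), 0L); });
        a.join();
        b.join();
        return lo + hi;
    }
    ```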

    Thread-level parallelism belongs to a different taxonomy: it is any form of parallelism that can be extracted and exploited by a multi-threaded system. It requires the decomposition to be coarse-grained enough for the different threads to run independently (otherwise the required synchronization overhead would make it useless). The alternative is, for example, ILP (instruction-level parallelism), where a single thread context extracts parallelism within the code by running on a deep out-of-order machine that schedules instructions based on operand readiness. This allows finer-grained parallelism, usually with less programmer involvement, but limits the parallelism to the depth of the out-of-order window.
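    As a rough, source-level illustration of ILP (again mine, with made-up function names): the four multiplies in independent() have no cross-dependencies, so a single thread running on an out-of-order core can overlap them; dependent() is a serial chain that no out-of-order window can shorten.

    ```cpp
    // Independent operations: the hardware can issue all four multiplies
    // in parallel within one thread -- this is instruction-level parallelism.
    double independent(const double* x) {
        double a = x[0] * x[0];
        double b = x[1] * x[1];
        double c = x[2] * x[2];
        double d = x[3] * x[3];
        return (a + b) + (c + d);   // shallow reduction tree
    }

    // Serial dependency chain: each multiply needs the previous result,
    // so the operations must execute one after another.
    double dependent(const double* x) {
        double r = x[0];
        r *= x[1];
        r *= x[2];
        r *= x[3];
        return r;
    }
    ```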

    On a related topic: be careful not to confuse simultaneous execution with concurrent execution. Concurrent threads merely overlap in time (they may be interleaved on a single logical processor), while simultaneous threads execute at the same instant on distinct hardware resources.

    Thread-level parallelism can be exploited by extracting task-level parallelism, or other forms of algorithmic decomposition, from the code. The threads can then run on a single-core system (preemptively scheduled) or on a hardware multi-threaded one. The latter can be achieved through multi-core systems, simultaneous multithreading, or both (common processors usually have many cores, and many of them support SMT on top of that).
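    To tie this back to the question's guess about logical processors: the standard library can report how many hardware threads the OS exposes. On an SMT-capable CPU this is typically cores × threads-per-core (e.g., 8 cores with 2-way SMT usually report 16). A minimal sketch:

    ```cpp
    #include <iostream>
    #include <thread>

    int main() {
        // Number of hardware threads (logical processors) visible to the OS;
        // the standard allows 0 to be returned when the value is unknown.
        unsigned n = std::thread::hardware_concurrency();
        std::cout << "logical processors: " << n << '\n';
    }
    ```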