Background:
When tasks are scheduled by a non-worker thread to be processed by worker threads in the thread pool, they enter a global queue. Worker threads use this global queue to fetch tasks to process in FIFO(First in first out) order. This makes sense as the task that was scheduled first is expected to complete first.
Now, if a worker process is processing a task and it spawns new subtasks, they are stored in the process's local queue. When the worker process completes its current task, it checks the local queue and extracts the task from the queue using LIFO (Last in first out) order.
[Edit] Sources:
Question:
What is the rationale behind using the LIFO order with the local queue?
This is explicitly mentioned in Microsoft's documentation for the task scheduler:
The local queues are accessed on last-in, first-out order (LIFO) in order to preserve cache locality and reduce contention.
Joe Duffy also discusses this more here. He says:
1) By executing the work most recently pushed into the queue in LIFO order, chances are that memory associated with it will still be hot in the cache.
2) By stealing in FIFO order, chances are that a larger “chunk” of work will be stolen (possibly reducing the chance of needing additional steals). The reason for this is that many work stealing workloads are divide-and-conquer in nature; in such cases, the recursion forms a tree, and the oldest items in the queue lie closer to the root; hence, stealing one of those implicitly also steals a (potentially) large subtree of computations that will unfold once that piece of work is stolen and run.