I am reading the Intel® 64 and IA-32 Architectures Software Developer’s Manual: System Programming Guide to learn more about how operating systems work, and there are a few things I can't figure out.
So far I understand that the OS kernel code is mapped into every process's virtual address space and runs with elevated privilege on a system call or interrupt/exception, that cores can communicate through IPIs, that threads of a process can be run as tasks in the x86 architecture, that a hardware task switch requires changing the task register to point to the next task-state segment, and so on. I also know how paging, protection and all of that works.
However, when multiple cores are present, I get confused about which core(s) perform "global" kernel operations like thread switching; I can't find it in the big document. My main concern is thread switching: which core performs the thread scheduling when a thread's quantum is consumed? Is the BSP doing it, or is this a distributed task shared by the BSP and all the APs?
When the timer interrupt required to switch threads is received, do all cores receive it? How do they manage shared resources in that case (so that if one core schedules a thread, the other cores don't schedule it too)?
Thanks in advance!!
Different operating systems are different. However:
For short-term decisions: synchronising things between 2 or more CPUs costs time, causes cache inefficiencies and reduces performance, so there's a desire for each CPU to act independently, with no data shared between CPUs (e.g. a queue of waiting threads for each CPU and an independent scheduler for each CPU).
For the longer term: you have to care about CPU load balance (e.g. you don't want one heavily overloaded CPU while other CPUs sit idle/wasted); and power-management decisions (e.g. a CPU that overheats and has to be throttled) and other possible features (e.g. "hot-plug CPU" support) can exacerbate CPU load imbalance.
A typical modern scheduler is a compromise between these goals: e.g. a "mostly independent" scheduler for each CPU, but one that allows a CPU to put thread(s) onto a different CPU's scheduler data structures somehow, to move/migrate threads for load balancing.
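To make that concrete, here is a minimal user-space sketch of the idea (not taken from any real kernel; all names like cpu_runqueue, pick_next and balance_load are made up for illustration): each CPU makes its short-term pick from its own queue, and a separate long-term step occasionally migrates a thread from the busiest queue to the idlest one.

```c
#include <stdio.h>

#define NCPUS   4
#define MAXTHR  8

struct cpu_runqueue {
    int nthreads;           /* how many runnable threads this CPU owns    */
    int thread[MAXTHR];     /* thread ids queued on this CPU               */
    /* a real kernel would also keep a per-CPU lock, priorities, etc.      */
};

static struct cpu_runqueue rq[NCPUS];

/* Short-term decision: each CPU only looks at its own queue. */
static int pick_next(int cpu)
{
    struct cpu_runqueue *q = &rq[cpu];
    if (q->nthreads == 0)
        return -1;                      /* nothing runnable: go idle       */
    return q->thread[--q->nthreads];    /* LIFO pop, for simplicity        */
}

/* Long-term decision: move one thread from the busiest CPU to the idlest. */
static void balance_load(void)
{
    int busiest = 0, idlest = 0;
    for (int c = 1; c < NCPUS; c++) {
        if (rq[c].nthreads > rq[busiest].nthreads) busiest = c;
        if (rq[c].nthreads < rq[idlest].nthreads)  idlest  = c;
    }
    if (rq[busiest].nthreads - rq[idlest].nthreads >= 2) {
        int tid = rq[busiest].thread[--rq[busiest].nthreads];
        rq[idlest].thread[rq[idlest].nthreads++] = tid;
        printf("migrated thread %d from CPU %d to CPU %d\n", tid, busiest, idlest);
    }
}

int main(void)
{
    /* CPU 0 starts overloaded while the others are idle. */
    for (int t = 0; t < 5; t++)
        rq[0].thread[rq[0].nthreads++] = t;

    balance_load();                      /* long-term: fix the imbalance   */
    for (int c = 0; c < NCPUS; c++) {    /* short-term: purely local picks */
        int t = pick_next(c);
        if (t < 0) printf("CPU %d is idle\n", c);
        else       printf("CPU %d runs thread %d\n", c, t);
    }
    return 0;
}
```

Note that only the (rare) migration step touches another CPU's data; the common-path decision of "which thread runs next on this CPU" shares nothing.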
When the timer interrupt required to switch threads is received, do all cores receive it?
Almost all thread switches are caused by threads blocking (having to wait to acquire a mutex, wait for disk I/O, wait for time to pass, wait for user input, ...), or by higher-priority threads unblocking and immediately pre-empting a currently running lower-priority thread (because whatever the higher-priority thread was waiting for happened). Task switches caused by threads consuming their quantum are just a rarely needed worst-case safeguard, and there's effectively no chance that threads on different CPUs will end their time slices at the same moment.

The typical/modern approach is that, when switching to a thread, the CPU's scheduler determines the maximum time that thread should be allowed to run and then asks a timer to notify it when that amount of time has elapsed. Most hardware has a timer for each CPU (e.g. the local APIC timer), so the "time quantum" machinery ends up being "per CPU, nothing shared between CPUs". The bigger problem is that the same per-CPU timers are often also used for everything else (e.g. TCP/IP time-outs, "sleep()", tracking how long things have been idle, ...), so you end up with a generic system with more overhead, where the cost of the scheduler creating a "timer event" and then cancelling it slightly later because it wasn't needed still becomes a noticeably silly pile of bloat.