According to some operating system textbooks, an ASID for each process is added to the TLB tag field to make context switches faster, so we don't need to flush the entire TLB on a context switch.
I have heard that some ARM processors and MIPS processors do have ASID in TLB. But I am not sure whether Intel x86 processors have ASID.
Meanwhile, it seems an ASID usually has fewer bits (e.g., 8 bits) than a PID (32 bits). So how does the system handle "ASID overflow" if there are more processes in memory than 2^8 in the 8-bit ASID case mentioned above?
Intel calls ASIDs process-context identifiers (PCIDs). On all Intel processors that support PCIDs, a PCID is 12 bits wide and occupies bits 11:0 of the CR3 register. By default, on processor reset, CR4.PCIDE (bit 17 of CR4) is cleared and CR3.PCID is zero, so an OS that wants to use PCIDs has to set CR4.PCIDE first to enable the feature. Writing a nonzero PCID value is only allowed when CR4.PCIDE is set. That said, when CR4.PCIDE is set, it is also possible to write zero to CR3.PCID. Therefore, the maximum number of PCIDs that can be used simultaneously is 2^12 = 4096.
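To make the bit layout concrete, here is a minimal user-space sketch of how a CR3 value could be composed from a page-table base and a PCID. The helper names (make_cr3, cr3_pcid) and the macros are made up for illustration, and the sketch ignores every other CR3 bit; it only shows the 12-bit field described above.

```c
#include <stdint.h>

#define CR3_PCID_BITS 12
#define CR3_PCID_MASK ((UINT64_C(1) << CR3_PCID_BITS) - 1) /* bits 11:0 */
#define PCID_COUNT    (1u << CR3_PCID_BITS)                /* 2^12 */

/* Hypothetical helper: put a PCID into the low 12 bits of a CR3 value.
 * pgd_phys is assumed to be a 4 KiB-aligned physical address. */
static inline uint64_t make_cr3(uint64_t pgd_phys, uint16_t pcid)
{
    return (pgd_phys & ~CR3_PCID_MASK) | ((uint64_t)pcid & CR3_PCID_MASK);
}

/* Hypothetical helper: extract the PCID field from a CR3 value. */
static inline uint16_t cr3_pcid(uint64_t cr3)
{
    return (uint16_t)(cr3 & CR3_PCID_MASK);
}
```

For example, make_cr3(0x12345000, 5) yields 0x12345005, and cr3_pcid of that value gives back 5.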
I'll discuss how the Linux kernel allocates PCIDs. The Linux kernel itself actually uses the term ASIDs even for Intel processors and so I'll use this term as well.
In general, there are many ways to manage the ASID space, such as the following:
Linux uses the last method and I'll discuss it in some additional detail.
Linux only remembers the last 6 ASIDs used on each core. This is specified by the TLB_NR_DYN_ASIDS macro. The system creates a data structure for each core of type tlb_state that defines an array as follows:
    struct tlb_context {
        u64 ctx_id;
        u64 tlb_gen;
    };

    struct tlb_state {
        ...
        u16 next_asid;
        struct tlb_context ctxs[TLB_NR_DYN_ASIDS];
    };

    DECLARE_PER_CPU_SHARED_ALIGNED(struct tlb_state, cpu_tlbstate);
The type includes other fields but I've shown only two for brevity. Linux defines the following ASID spaces:

- The canonical ASID space: the values 0 to TLB_NR_DYN_ASIDS - 1. These values are stored in the next_asid field and used as indices into the ctxs array.
- The kernel PCID (kPCID) space: the values 1 to TLB_NR_DYN_ASIDS (that is, the canonical ASID + 1). These values are actually stored in CR3.PCID.
- The user PCID (uPCID) space: the values 2048 + 1 to 2048 + TLB_NR_DYN_ASIDS (that is, the kPCID + 2048). These values are actually stored in CR3.PCID.

Each process has a single canonical ASID. This is the value used by Linux itself. Each canonical ASID is associated with a kPCID and a uPCID, which are the values that are actually stored in CR3.PCID. The reason for having two ASIDs per process is to support page-table isolation (PTI), which mitigates the Meltdown vulnerability. In fact, with PTI, each process has two virtual address spaces, each with its own ASID, but the two ASIDs have a fixed arithmetic relationship as shown above. So even though Intel processors support 4096 ASIDs per core, Linux only uses 12 per core. I'll get to the ctxs array, just bear with me a little.
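The fixed arithmetic relationship between a canonical ASID and its two PCIDs can be sketched as follows. This is an illustration, not the kernel's code: the function names are chosen for readability, and the 2048 offset for the user PCID is my understanding of how PTI separates the two spaces, so treat it as an assumption.

```c
#include <stdint.h>

#define TLB_NR_DYN_ASIDS     6    /* value quoted in the text */
#define PTI_USER_PCID_OFFSET 2048 /* assumption: uPCID = kPCID + 2048 */

/* PCID 0 is left alone, so canonical ASID n maps to kernel PCID n + 1. */
static inline uint16_t kern_pcid(uint16_t asid)
{
    return (uint16_t)(asid + 1);
}

/* The user-mode address space of the same process gets a PCID at a
 * fixed offset from the kernel one. */
static inline uint16_t user_pcid(uint16_t asid)
{
    return (uint16_t)(kern_pcid(asid) + PTI_USER_PCID_OFFSET);
}
```

With 6 canonical ASIDs, this gives kernel PCIDs 1 through 6 and user PCIDs 2049 through 2054, which is where the figure of 12 PCIDs per core comes from.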
Linux assigns ASIDs to processes dynamically on context switches, not on creation. The same process may get different ASIDs on different cores and its ASID may change dynamically whenever a thread of that process is scheduled to run on a core. This is done in the switch_mm_irqs_off function, which gets called whenever the scheduler switches from one thread to another on a core, even if the two threads belong to the same process. There are two cases to consider:
When the two threads belong to different processes, the kernel executes the following function call:
    choose_new_asid(next, next_tlb_gen, &new_asid, &need_flush);
The first argument, next
, points to the memory descriptor of the process to which the thread that scheduler selected to resume belongs. This object contains many things. But one thing we care about here is ctx_id
, which is a 64-bit value that is unique per existing process. The next_tlb_gen
is used to determine whether a TLB invalidation is required or not as I'll discuss shortly. The function returns new_asid
which holds the ASID assigned to the process and need_flush
which says whether a TLB invalidation is required. The return type of the function is void
.
    static void choose_new_asid(struct mm_struct *next, u64 next_tlb_gen,
                                u16 *new_asid, bool *need_flush)
    {
        u16 asid;

        if (!static_cpu_has(X86_FEATURE_PCID)) {
            *new_asid = 0;
            *need_flush = true;
            return;
        }

        if (this_cpu_read(cpu_tlbstate.invalidate_other))
            clear_asid_other();

        for (asid = 0; asid < TLB_NR_DYN_ASIDS; asid++) {
            if (this_cpu_read(cpu_tlbstate.ctxs[asid].ctx_id) !=
                next->context.ctx_id)
                continue;

            *new_asid = asid;
            *need_flush = (this_cpu_read(cpu_tlbstate.ctxs[asid].tlb_gen) <
                           next_tlb_gen);
            return;
        }

        /*
         * We don't currently own an ASID slot on this CPU.
         * Allocate a slot.
         */
        *new_asid = this_cpu_add_return(cpu_tlbstate.next_asid, 1) - 1;
        if (*new_asid >= TLB_NR_DYN_ASIDS) {
            *new_asid = 0;
            this_cpu_write(cpu_tlbstate.next_asid, 1);
        }
        *need_flush = true;
    }
Logically, the function works as follows. If the processor does not support PCIDs, then all processes get an ASID value of zero and a TLB flush is always required. I'll skip the invalidate_other check since it's not relevant. Next, the loop iterates over all 6 canonical ASIDs and uses them as indices into the ctxs array. The process whose context identifier is cpu_tlbstate.ctxs[asid].ctx_id is currently assigned the ASID value asid. So the loop checks whether the process still has an ASID assigned to it. If it does, the same ASID is reused and need_flush is updated based on next_tlb_gen. The reason that we may need to flush the TLB entries associated with the ASID even though the ASID was not recycled is the lazy TLB invalidation mechanism, which is beyond the scope of your question.
If none of the currently used ASIDs has been assigned to the process, then we need to allocate a new one. The call to this_cpu_add_return increments the value in next_asid by 1 and returns the incremented value, which equals the kPCID of the slot being allocated. Subtracting 1 then gives the canonical ASID. If that exceeds the maximum canonical ASID value (TLB_NR_DYN_ASIDS - 1), then we wrap around to canonical ASID zero and write 1 (the kPCID corresponding to ASID zero) to next_asid, so the next allocation uses canonical ASID 1. Allocating a slot this way means that some other process may have been assigned the same canonical ASID, so we definitely want to flush the TLB entries associated with that ASID on the core. Then, when choose_new_asid returns to switch_mm_irqs_off, the ctxs array and CR3 are updated accordingly. Writing to CR3 with the no-flush bit (bit 63) clear makes the core automatically flush the TLB entries associated with that ASID. If the process whose ASID was reassigned to another process is still alive, then the next time one of its threads runs on that core, it will be assigned a new ASID there. Otherwise, if that process is dead, its ASID will simply be recycled at some point in the future. This whole process happens per core.
The reason that Linux uses exactly 6 ASIDs per core is that it makes the tlb_state type just small enough to fit within two 64-byte cache lines. Generally, there can be dozens of processes that are simultaneously alive on a Linux system. However, most of them are typically dormant, so the way Linux manages the ASID space is very efficient in practice. It would be interesting to see an experimental evaluation of the impact of the value of TLB_NR_DYN_ASIDS on performance, but I'm not aware of any such published study.
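As a rough check of the cache-line argument, the size of the ctxs array alone can be computed at compile time. This sketch only covers the two-field tlb_context shown earlier; the other tlb_state fields are not shown in this answer, so the byte budget left for them is all the sketch can say. CACHE_LINE_BYTES, CTXS_BYTES and BYTES_LEFT_IN_TWO_LINES are names introduced here for illustration.

```c
#include <stdint.h>

#define TLB_NR_DYN_ASIDS 6
#define CACHE_LINE_BYTES 64

struct tlb_context {
    uint64_t ctx_id;
    uint64_t tlb_gen;
};

/* Bytes taken by the ctxs array alone, and what remains of two cache
 * lines for every other field of the per-core structure. */
enum {
    CTXS_BYTES = TLB_NR_DYN_ASIDS * sizeof(struct tlb_context),
    BYTES_LEFT_IN_TWO_LINES = 2 * CACHE_LINE_BYTES - CTXS_BYTES
};

_Static_assert(sizeof(struct tlb_context) == 16,
               "two 64-bit fields, no padding expected");
_Static_assert(CTXS_BYTES <= 2 * CACHE_LINE_BYTES,
               "ctxs must fit in two cache lines");
```

With 6 slots the array takes 96 bytes, leaving 32 bytes in two cache lines for the remaining fields. With 7 slots only 16 bytes would remain, and with 8 none.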