Tags: c++, c, x86, numa, cpuid

How to find the L3 cache index and NUMA node index for the current hardware thread


I'm building a topological tree of sockets, NUMA nodes, caches, cores, and threads for any Intel or AMD system in C.

Building this hierarchy, I want to ensure hardware threads are grouped together appropriately so it's clear who precisely shares what. I've found that I can set a thread's affinity and then use the cpuid instruction to get a lot of the info I want, but not all.

If a package/socket has multiple NUMA nodes, how do I get an index of the NUMA node for the current hardware thread? If the NUMA node has multiple L3 caches, how do I get the index?

AMD has something for NUMA node ID in Fn8000_001E_ECX, but I can't find anything comparable for Intel. And nothing re: L3 index for either.


Solution

  • If a package/socket has multiple NUMA nodes, how do I get an index of the NUMA node for the current hardware thread?

    You get this information from ACPI.

    Specifically, there's a "System Resource Affinity Table" (SRAT) that contains a list of structures describing which NUMA domain different things (CPUs, memory areas, ...) are in at boot time. On 80x86, you'd parse this list looking for both "Processor Local APIC/SAPIC Affinity Structures" and "Processor Local x2APIC Affinity Structures".

    For hot-plug CPUs the table isn't enough (the SRAT won't change when a CPU is inserted or removed after boot), so you might also need an ACPI Machine Language (AML) interpreter to execute _PXM objects and obtain current NUMA information. Computers that support hot-plug CPUs are very rare, though.

    Note that in ACPI, NUMA domain numbers are excessively large (32 bits) and not guaranteed to be contiguous (e.g. in theory you could have 2 NUMA nodes with the domain numbers 0x12345678 and 0x9ABCDEF0), which means you can't use them directly as array indexes (e.g. if you want to do something like "NUMA_stats[domain].CPU_count++;" it won't be fun). There is also no standard value reserved for "unknown NUMA domain", which is inconvenient for code that determines topology (e.g. you'd need a separate "did/didn't find a valid NUMA domain" flag to keep track).
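
The SRAT walk and the sparse-domain bookkeeping described above can be sketched roughly as follows. This is a minimal illustration, not a definitive implementation. It assumes the raw SRAT bytes are already in memory (how you get them depends on the OS or firmware; on Linux the table is typically readable as root from /sys/firmware/acpi/tables/SRAT). The entry offsets follow my reading of the ACPI specification and should be checked against the version you target. All names here (cpu_affinity, srat_parse, domain_index, find_domain, MAX_NODES) are hypothetical and made up for this example.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_NODES 64            /* hypothetical cap on distinct NUMA domains */

/* Hypothetical record: one enabled CPU listed in the SRAT. */
struct cpu_affinity {
    uint32_t apic_id;           /* 8-bit APIC ID or 32-bit x2APIC ID */
    uint32_t domain;            /* raw 32-bit ACPI proximity domain */
};

/* Read a little-endian 32-bit value without assuming alignment. */
static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Walk an in-memory SRAT and collect enabled CPUs into out[].
 * Returns the number of entries stored (extras beyond max are dropped),
 * or -1 if the table looks malformed. */
int srat_parse(const uint8_t *t, size_t size, struct cpu_affinity *out, int max)
{
    if (size < 48 || memcmp(t, "SRAT", 4) != 0)
        return -1;
    uint32_t len = rd32(t + 4);         /* total table length from the header */
    if (len < 48 || len > size)
        return -1;

    int n = 0;
    size_t off = 48;                    /* 36-byte header + 12 reserved bytes */
    while (off + 2 <= len) {
        const uint8_t *e = t + off;
        uint8_t type = e[0], elen = e[1];
        if (elen < 2 || off + elen > len)
            return -1;

        struct cpu_affinity c = {0, 0};
        int enabled = 0;
        if (type == 0 && elen >= 16) {  /* Local APIC/SAPIC affinity */
            enabled = rd32(e + 4) & 1;
            c.apic_id = e[3];
            /* the domain is split: bits 7:0 at byte 2, bits 31:8 at bytes 9..11 */
            c.domain = (uint32_t)e[2] | (uint32_t)e[9] << 8 |
                       (uint32_t)e[10] << 16 | (uint32_t)e[11] << 24;
        } else if (type == 2 && elen >= 24) {  /* Local x2APIC affinity */
            enabled = rd32(e + 12) & 1;
            c.domain = rd32(e + 4);
            c.apic_id = rd32(e + 8);
        }
        if (enabled && n < max)
            out[n++] = c;
        off += elen;                    /* other entry types are skipped */
    }
    return n;
}

/* Map a sparse 32-bit domain to a dense index usable in arrays.
 * A domain seen for the first time is appended; returns -1 if the table is full. */
int domain_index(uint32_t *table, int *count, uint32_t domain)
{
    for (int i = 0; i < *count; i++)
        if (table[i] == domain)
            return i;
    if (*count >= MAX_NODES)
        return -1;
    table[*count] = domain;
    return (*count)++;
}

/* Returns 1 and stores the domain if apic_id is listed, else 0.
 * The separate return flag stands in for the missing "unknown domain" value. */
int find_domain(const struct cpu_affinity *cpus, int n,
                uint32_t apic_id, uint32_t *domain)
{
    for (int i = 0; i < n; i++) {
        if (cpus[i].apic_id == apic_id) {
            *domain = cpus[i].domain;
            return 1;
        }
    }
    return 0;
}
```

To use it for the current hardware thread, you would pin the thread, read its APIC ID with cpuid (for example the x2APIC ID in EDX of leaf 0Bh), pass that to find_domain, and then feed the result through domain_index to get a small index that is safe for something like NUMA_stats[index].CPU_count++. The linear searches are fine for the handful of nodes a real machine has; a hash map would only matter at much larger scales.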