Tags: x86, interrupt, osdev, pci, apic

How does MSI-X trigger interrupt handlers? Is there a need to poll the chosen memory address?


I have a small kernel which is booted with UEFI, and I'm using QEMU for virtualization. I want to write an xHCI driver to support USB keyboards in my kernel, but I'm having trouble finding concise and clear information. I have "found" the xHCI controller in my kernel: I have a pointer to its PCI configuration space, and it is MSI-X capable. I want to use MSI-X, but I'm having trouble understanding how that works with the xHCI and USB.
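For context, this is roughly how I locate the controller and its MSI-X capability. The pci_cfg_read8/pci_cfg_read16 helpers are just placeholders for my own configuration-space accessors (e.g. over ECAM), not any standard API:

#include <stdint.h>
#include <stdbool.h>

/* Hypothetical helpers that read PCI configuration space. */
extern uint8_t  pci_cfg_read8(uint8_t bus, uint8_t dev, uint8_t fn, uint8_t off);
extern uint16_t pci_cfg_read16(uint8_t bus, uint8_t dev, uint8_t fn, uint8_t off);

#define PCI_CLASS_SERIAL_BUS 0x0C
#define PCI_SUBCLASS_USB     0x03
#define PCI_PROGIF_XHCI      0x30
#define PCI_CAP_ID_MSIX      0x11

/* True if the function at bus/dev/fn is an xHCI controller. */
static bool is_xhci(uint8_t bus, uint8_t dev, uint8_t fn)
{
    return pci_cfg_read8(bus, dev, fn, 0x0B) == PCI_CLASS_SERIAL_BUS &&
           pci_cfg_read8(bus, dev, fn, 0x0A) == PCI_SUBCLASS_USB &&
           pci_cfg_read8(bus, dev, fn, 0x09) == PCI_PROGIF_XHCI;
}

/* Walk the capability list; return the offset of the MSI-X capability, or 0. */
static uint8_t find_msix_cap(uint8_t bus, uint8_t dev, uint8_t fn)
{
    if (!(pci_cfg_read16(bus, dev, fn, 0x06) & (1 << 4)))  /* Status.CapList set? */
        return 0;
    uint8_t off = pci_cfg_read8(bus, dev, fn, 0x34) & ~0x3u;
    while (off) {
        if (pci_cfg_read8(bus, dev, fn, off) == PCI_CAP_ID_MSIX)
            return off;
        off = pci_cfg_read8(bus, dev, fn, off + 1) & ~0x3u;  /* next pointer */
    }
    return 0;
}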

My problem is that osdev.org is normally quite informative and has the basics I need to implement some functionality. In the case of MSI-X, that doesn't seem to be the case. I'm having a hard time making the link between all the information I have from osdev.org and the MSI-X functionality.

So basically, I find the MSI-X table and then set some addresses there to tell the xHCI PCI device to write to those addresses to trigger an interrupt. But is an interrupt handler called at some point? Do I need to poll this address to determine whether an interrupt occurred? I would have thought that the Vector Control field in the MSI-X table would let me set an interrupt vector, but all of its bits (apart from the mask bit) are reserved.

EDIT

I found the following Stack Overflow Q&A which partially answers my question: Question about Message Signaled Interrupts (MSI) on x86 LAPIC system.

So basically, the low byte of the Data register contains the vector to trigger, and the Message Address register contains the LAPIC ID to target. (A sketch of how I'm programming a table entry with this follows my questions below.) I still have some questions.

  1. Why does the Message Address register contain a fixed top of 0xFEE?

  2. What are the RH, DM and XX bits in the Message Address register?

  3. How does this work with the LAPIC? Basically, how does it trigger interrupts in the LAPIC? Is it a special feature of PCI devices which allows them to trigger interrupts in the LAPIC, or is it simply that PCI devices write to memory-mapped registers of the LAPIC with some specific data, which triggers an interrupt? Normally the LAPIC is accessed from within the core at an address which is the same for every LAPIC. Is it some kind of inter-processor interrupt from outside the CPU?
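Here is a minimal sketch of what I mean by setting "some addresses" in the MSI-X table, written against the entry layout from the linked answer. I'm assuming the BAR holding the table is already mapped, and apic_id/vector are whatever my kernel picks; the struct and function names are my own:

#include <stdint.h>

/* One 16-byte MSI-X table entry, as laid out in the PCI spec. */
struct msix_table_entry {
    volatile uint32_t msg_addr_lo;  /* Message Address, low 32 bits  */
    volatile uint32_t msg_addr_hi;  /* Message Address, high 32 bits */
    volatile uint32_t msg_data;     /* Message Data                  */
    volatile uint32_t vector_ctrl;  /* bit 0 = per-entry mask bit    */
};

static void msix_program_entry(struct msix_table_entry *e,
                               uint8_t apic_id, uint8_t vector)
{
    /* Fixed 0xFEE top so the local APIC accepts the write;
       bits 19:12 carry the destination APIC ID, RH/DM left at 0. */
    e->msg_addr_lo = 0xFEE00000u | ((uint32_t)apic_id << 12);
    e->msg_addr_hi = 0;
    /* Low byte = vector; zero upper bits mean fixed delivery, edge trigger. */
    e->msg_data = vector;
    /* Clear the mask bit so this entry is allowed to fire. */
    e->vector_ctrl &= ~1u;
}

After that I'd still need to set the MSI-X Enable bit (bit 15 of Message Control in the MSI-X capability), clear Function Mask (bit 14), and install a handler at "vector" in my IDT.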


Solution

    1. Why does the Message Address register contain a fixed top of 0xFEE?

    Modern CPUs and their interconnects are like networking - packets with headers describing what their contents are, routed around a set of links based on "addresses" (which are like the IP address in a TCP/IP packet).

    MSI is essentially a packet saying "write this data to that address", where the address corresponds to another device (the local APIC inside a CPU). The fixed 0xFEE top is necessary because that's what the protocol/s for the bus require for that packet/message type, and because it tells the local APIC that it has to accept the packet and act on it. If the address part were wrong, it'd just look like any other write to anything else (it wouldn't be accepted by the local APIC and wouldn't be delivered as an IRQ to the CPU's core).

    Note: In theory, for most (Intel) CPUs the physical address of the local APIC can be changed. In practice there's no sane reason to ever want to do that, and even if the physical address of the local APIC is changed I think the standard/original "0xFEE....." address range is still hard-wired into the local APIC for MSI acceptance.
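    As a concrete illustration (not from any official header; the bit positions follow Intel's SDM description of the MSI Message Address, and the names are made up), the 32-bit address a device writes to would be assembled roughly like this:

    #include <stdint.h>

    /* Sketch: compose the MSI Message Address. Bits 31:20 are the fixed 0xFEE
       top, bits 19:12 the destination APIC ID, bit 3 RH, bit 2 DM. */
    static inline uint32_t msi_address(uint8_t dest_apic_id,
                                       int redirection_hint, int dest_mode)
    {
        return 0xFEE00000u
             | ((uint32_t)dest_apic_id << 12)
             | ((redirection_hint ? 1u : 0u) << 3)
             | ((dest_mode        ? 1u : 0u) << 2);
    }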

    2. What are the RH, DM and XX bits in the Message Address register?

    The local APIC (among other uses) is used by software (kernel) on one CPU to send IRQ/s to other CPUs; these are called "inter-processor interrupts" (IPIs). When MSI was invented they simply re-used the same flags that already existed for IPIs. In other words, the DM (Destination Mode) and most of the other bits are defined in the section of Intel's manual that describes the local APIC. To understand these bits properly you need to understand the local APIC and IPIs; especially the part about IPI delivery.
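    For comparison with the address sketch above, here's a sketch of how the Message Data half would be assembled; the vector, delivery mode, level and trigger-mode fields mirror the low dword of the ICR used for IPIs (names are made up, bit positions per the SDM):

    #include <stdint.h>

    /* Sketch: compose the MSI Message Data. Bits 7:0 = vector,
       bits 10:8 = delivery mode, bit 14 = level, bit 15 = trigger mode. */
    enum msi_delivery_mode { MSI_DM_FIXED = 0x0, MSI_DM_LOWPRI = 0x1, MSI_DM_NMI = 0x4 };

    static inline uint32_t msi_data(uint8_t vector, enum msi_delivery_mode dm,
                                    int level_triggered, int assert_level)
    {
        return (uint32_t)vector
             | ((uint32_t)dm << 8)
             | ((assert_level    ? 1u : 0u) << 14)
             | ((level_triggered ? 1u : 0u) << 15);
    }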

    Later (when introducing hardware virtualization) Intel added a "redirection hint" (to allow IRQs from devices to be redirected to specific virtual machines). That is described in a specification called "Intel® Virtualization Technology for Directed I/O Architecture Specification" (available here: https://software.intel.com/content/www/us/en/develop/download/intel-virtualization-technology-for-directed-io-architecture-specification.html ).

    Even later, Intel wanted to support systems with more than 255 CPUs but the "APIC ID" was an 8-bit ID (limiting the system to a max. of 255 CPUs and one IO APIC). To fix this they created x2APIC (which changed a bunch of things - 32-bit APIC IDs, local APIC accessed by MSRs instead of physical addresses, ...). However, all the old devices (including IO APICs and MSI) were designed for 8-bit APIC IDs, so to fix that problem they just recycled the same "IRQ remapping" they already had (from virtualization) so that IRQs with 8-bit APIC IDs could be remapped to IRQs with 32-bit APIC IDs. The result is relatively horrible and excessively confusing (e.g. a kernel that wants to support lots of CPUs needs to use IOMMU for things that have nothing to do with virtualization), but it works without backward compatibility problems.

    3. How does this work with the LAPIC? Basically, how does it trigger interrupts in the LAPIC?

    I'd expect that (for P6 and later - 80486 and Pentium used a different "3-wire APIC bus" instead) it all uses the same "packet format" (messages) - e.g. that the IO APIC sends the same "write this data to this address" packet/message (to local APIC) that is used by IPIs and MSI.

    Is it some kind of inter-processor interrupt from outside the CPU?

    Yes! :-)
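    To make the parallel concrete: a kernel sending a fixed IPI through the xAPIC's ICR fills in the same vector/delivery-mode fields that a device puts in its MSI Message Data; the main difference is who performs the write. A rough sketch, assuming the local APIC MMIO page is identity-mapped at its default physical address:

    #include <stdint.h>

    #define LAPIC_BASE    0xFEE00000u  /* default xAPIC MMIO base */
    #define LAPIC_ICR_LO  0x300
    #define LAPIC_ICR_HI  0x310

    static inline void lapic_write(uint32_t reg, uint32_t val)
    {
        *(volatile uint32_t *)(uintptr_t)(LAPIC_BASE + reg) = val;
    }

    /* Fixed-delivery, edge-triggered IPI to one CPU - the same "vector +
       delivery mode" payload MSI carries, just written by software. */
    static void send_fixed_ipi(uint8_t dest_apic_id, uint8_t vector)
    {
        lapic_write(LAPIC_ICR_HI, (uint32_t)dest_apic_id << 24);  /* destination */
        lapic_write(LAPIC_ICR_LO, (1u << 14) | vector);           /* level = assert */
    }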