I'm implementing a PCIe driver, and I'd like to understand at what levels interrupts can or should be enabled/disabled. I intentionally do not specify an OS, as I'm assuming this is relevant for any platform. By levels I mean the following:
- The OS-specific interrupt-handling framework
- Interrupts can be disabled or enabled in the PCI/PCIe configuration space registers, e.g. the COMMAND register
- Interrupts can also be masked at the device level; for instance, we can configure the device not to trigger certain interrupts to the host
I understand that whatever interrupt type is being used on PCIe (INTx emulation, MSI or MSI-X), it has to be delivered to the host OS.
So my question is really -- do we actually have to enable or disable interrupts at every layer, or is it sufficient to do so only at the level closest to the hardware, e.g. in the relevant PCI registers?
Disabling interrupts at the various levels usually serves completely different purposes.
Disabling interrupts:
- In the OS (really, this means in the CPU) - This is generally about avoiding race conditions. If state or memory corruption could occur when the CPU is interrupted during a particular section of code, then that section needs to run with interrupt handling disabled. Interrupt handlers must not acquire normal locks (by definition they can't be suspended), and they must not attempt to acquire a spin-lock that is held by the thread currently scheduled on the same CPU (because that thread is blocked from progressing by the very same interrupt handler!), so ensuring data safety with interrupt handlers can be tricky. Handling interrupts promptly is generally a good thing, so you want to absolutely minimise such sections in any code you write. Do as much of your interrupt handling in secondary interrupt handlers as possible to avoid such situations; secondary interrupt handlers are really just callbacks on a regular OS thread, which doesn't have any of the restrictions of a primary interrupt handler. (A sketch of this appears after the list.)
- PCI/PCIe configuration - It's my understanding that this is mainly about routing interrupts, and it is something you normally do once when your driver loads (or is activated by a client) and again when it unloads (or is deactivated). It may also be affected by power management events. In some OSes, the PCI(e) level is actually handled for you when you activate PCI device interrupts via higher-level APIs. (See the COMMAND-register sketch after the list.)
- On-device - This is usually an optimisation to avoid interrupting the CPU when it doesn't need to be interrupted. The most common scenario is that an event happens on the device, so an interrupt is generated. The driver's primary interrupt handler checks the device registers to see whether the driver needs to do any processing. If so, it disables interrupts on the device and schedules the driver's secondary interrupt handler to run. The OS eventually runs the secondary handler, which processes whatever information the device has provided until it runs out of things to do. Then it re-enables interrupts, checks once more whether there is any work pending from the device, and if there is none, it terminates. (If there are items to process in this last check, it re-disables interrupts and starts over from the beginning.) The idea is that until the secondary interrupt handler has finished processing, there is no point in triggering the primary interrupt handler for additional events; it would only waste resources, because the driver is already busy processing the event queue. The final check after re-enabling interrupts avoids a race condition between an event arriving and interrupts being re-enabled. (See the mask/process/unmask sketch after the list.)
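For the OS/CPU level point, here is a minimal sketch in Linux-flavoured C, purely as one concrete example since the question is OS-agnostic. `struct my_dev` and its fields are hypothetical; the point is that the thread-side code disables local interrupts only for the short window in which it touches state shared with the primary handler:

```c
#include <linux/spinlock.h>
#include <linux/interrupt.h>

struct my_dev {                      /* hypothetical driver-private state */
	spinlock_t lock;
	unsigned int events_pending;
};

/* Primary (hard-IRQ) handler: runs with interrupts off on this CPU,
 * so a plain spin_lock() is enough here. */
static irqreturn_t my_primary_handler(int irq, void *cookie)
{
	struct my_dev *dev = cookie;

	spin_lock(&dev->lock);
	dev->events_pending++;
	spin_unlock(&dev->lock);
	return IRQ_HANDLED;
}

/* Thread-side code sharing that state must disable local interrupts
 * while holding the lock, or the handler could deadlock against us. */
static void my_thread_side_update(struct my_dev *dev)
{
	unsigned long flags;

	/* Keep this interrupts-off window as short as possible. */
	spin_lock_irqsave(&dev->lock, flags);
	dev->events_pending = 0;
	spin_unlock_irqrestore(&dev->lock, flags);
}
```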
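For the PCI/PCIe configuration point, a sketch of toggling legacy INTx delivery by hand via the COMMAND register (offset 0x04, Interrupt Disable bit 10). The accessors `pci_cfg_read16()`/`pci_cfg_write16()` are hypothetical stand-ins for whatever your platform provides; note that this bit has no effect on MSI/MSI-X, which are enabled through their capability structures:

```c
#include <stdint.h>
#include <stdbool.h>

#define PCI_REG_COMMAND      0x04u        /* COMMAND register offset */
#define PCI_COMMAND_INTX_DIS (1u << 10)   /* Interrupt Disable bit */

/* Hypothetical config-space accessors provided by the platform. */
uint16_t pci_cfg_read16(void *pdev, uint16_t offset);
void     pci_cfg_write16(void *pdev, uint16_t offset, uint16_t value);

static void pci_set_intx(void *pdev, bool enable)
{
	uint16_t cmd = pci_cfg_read16(pdev, PCI_REG_COMMAND);

	if (enable)
		cmd &= (uint16_t)~PCI_COMMAND_INTX_DIS;  /* allow legacy INTx */
	else
		cmd |= PCI_COMMAND_INTX_DIS;             /* mask legacy INTx */

	pci_cfg_write16(pdev, PCI_REG_COMMAND, cmd);
}
```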
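And for the on-device point, the mask/process/unmask pattern described above, in OS-neutral C. All of the `dev_*` helpers and `os_schedule_secondary()` are hypothetical device-register and deferred-work wrappers standing in for your driver's real ones:

```c
#include <stdbool.h>

bool dev_has_work(void *dev);            /* hypothetical: any events pending? */
void dev_mask_irqs(void *dev);           /* hypothetical: stop device interrupts */
void dev_unmask_irqs(void *dev);         /* hypothetical: allow device interrupts */
void dev_process_one_event(void *dev);   /* hypothetical: handle one queued event */
void os_schedule_secondary(void *dev);   /* hypothetical deferred-work mechanism */

/* Primary handler: do the minimum, quiesce the device, defer the rest. */
void primary_handler(void *dev)
{
	if (!dev_has_work(dev))
		return;                      /* not ours / nothing to do */

	dev_mask_irqs(dev);
	os_schedule_secondary(dev);
}

/* Secondary handler: drain the queue, then re-enable and re-check. */
void secondary_handler(void *dev)
{
	for (;;) {
		while (dev_has_work(dev))
			dev_process_one_event(dev);

		dev_unmask_irqs(dev);

		/* Close the race: an event may have arrived between the last
		 * check and unmasking. If so, mask again and keep going. */
		if (!dev_has_work(dev))
			break;
		dev_mask_irqs(dev);
	}
}
```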
I hope that answers your question.