Inherent race condition in Linux IRQ handlers


Suppose there is a port-mapped I/O device which arbitrarily generates interrupts on an IRQ line. The device's pending interrupts may be cleared via a single outb call to a particular register.

Furthermore, suppose the following interrupt handler is assigned to the relevant IRQ line via request_irq:

irqreturn_t handler(int irq, void *data)
{
        /* clear pending IRQ on device */
        outb(0, CLEAR_IRQ_REGISTER_ADDR);

        /* device may generate another IRQ at this point,
         * but this handler function has not yet returned */

        /* signal kernel that IRQ has been handled */
        return IRQ_HANDLED;
}

Is there an inherent race condition in this IRQ handler? For example, if the device generates another interrupt after the "clear IRQ" outb call, but before the handler function returns IRQ_HANDLED, what will happen?

I can think of three scenarios:

  1. IRQ line freezes and can no longer be handled due to deadlock between the device and Linux kernel.
  2. Linux kernel executes handler again immediately after return, in order to handle second interrupt.
  3. Linux kernel interrupts handler with second call to handler.

Solution

  • Scenario 2 is the correct one. Interrupt handlers run with interrupts disabled on the local CPU, so after your handler returns, the interrupt controller will see that another interrupt occurred and your handler will be called again.

    What may happen, though, is that you miss some interrupts if you are not fast enough and multiple interrupts occur while you are still handling the first one. This should not happen in your case, because you have to clear the pending interrupt.

    Andy's answer is about another issue. You definitely have to lock access to your device and resources because your handler may run concurrently on different CPUs.
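
The behaviour described in this answer can be illustrated with a small user-space toy model. It is only a sketch under simplifying assumptions, not kernel code: the interrupt controller is assumed to keep a single latched "pending" bit per line, and the names irq_line, device_raise, run_handler and dispatch are hypothetical, invented for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model of one IRQ line (assumption: one latched pending bit). */
struct irq_line {
    bool pending;      /* set when the device raises the line */
    unsigned raised;   /* interrupts generated by the device */
    unsigned handled;  /* number of handler invocations */
};

/* The device raises an interrupt: the single pending bit is latched.
 * Raising again while the bit is already set adds no new information. */
void device_raise(struct irq_line *l)
{
    assert(l != NULL);
    l->raised++;
    l->pending = true;
}

/* One handler invocation. The pending bit is cleared on entry, then the
 * device fires raises_during times before the handler returns. */
void run_handler(struct irq_line *l, unsigned raises_during)
{
    assert(l != NULL);
    l->pending = false;
    for (unsigned i = 0; i < raises_during; i++)
        device_raise(l);
    l->handled++;
}

/* Keep calling the handler while the line is pending (scenario 2).
 * Only the first invocation sees extra interrupts arrive.
 * Returns the number of handler invocations. */
unsigned dispatch(struct irq_line *l, unsigned raises_during_first)
{
    unsigned runs = 0;

    assert(l != NULL);
    while (l->pending) {
        run_handler(l, runs == 0 ? raises_during_first : 0);
        runs++;
    }
    return runs;
}
```

In this model, raising one interrupt and then calling dispatch with one extra interrupt arriving during the first invocation runs the handler twice, which is scenario 2. With three extra interrupts the handler still runs only twice, so four raised interrupts are served by two invocations. That is the "missed interrupts" case mentioned above.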