linux, embedded, low-latency, jack, cpuset

Low latency process on single core with embedded Linux


I would like to run a single low-latency task (for audio, ALSA/JACK) on a dedicated core of an embedded Linux system. Keeping the scheduler and other interrupts off that core might be the key here.

I have found several approaches so far, e.g. cpusets and an offline scheduler from 2009 (which unfortunately does not support user-space tasks).

Is there a newer/more convenient way to achieve this?



Solution

  • The topic you are looking for is called "CPU affinity". There are two main aspects to CPU affinity: the affinity of processes and the affinity of interrupts.

    To my (admittedly limited) knowledge:

    • Processes are assigned to CPUs using the taskset command. (The affinity is inherited by child processes.)

    • The assignment of interrupts to CPUs on Linux can be changed through /proc/irq/<n>/smp_affinity. To verify that the assignment took effect, check /proc/interrupts to see which CPUs serve which interrupts. See here.
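    The verification step can be sketched in a few lines of Python. This is only an illustration under the assumption that /proc/interrupts has its usual layout (a header row of CPU names, then one row per interrupt with one count per CPU); the function names parse_interrupts, cpus_serving and load_interrupts are made up for this example.

```python
from pathlib import Path


def parse_interrupts(text):
    """Map each IRQ label to a {cpu_number: count} dictionary."""
    lines = text.splitlines()
    if not lines:
        return {}
    # The header row names the online CPUs, e.g. "CPU0 CPU1".
    cpus = [int(name[3:]) for name in lines[0].split() if name.startswith("CPU")]
    table = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue  # not an interrupt row
        counts = {}
        for cpu, token in zip(cpus, rest.split()):
            if not token.isdigit():
                break  # reached the controller/device description
            counts[cpu] = int(token)
        table[label.strip()] = counts
    return table


def cpus_serving(counts):
    """Return the CPUs that have handled the interrupt at least once."""
    return sorted(cpu for cpu, n in counts.items() if n > 0)


def load_interrupts(path="/proc/interrupts"):
    """Read and parse the live table (Linux only)."""
    return parse_interrupts(Path(path).read_text())
```

    Calling cpus_serving(load_interrupts()["27"]) before and after changing smp_affinity (27 being a placeholder IRQ number) shows whether the reserved CPU still receives that interrupt. Note that the counters are cumulative, so compare the difference between two readings rather than the absolute values.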

    In your particular case, you want to reserve a single CPU (i.e. core) for your critical application, for example CPU0. That means all other processes and interrupts should be assigned to every CPU except CPU0, using an affinity mask that has bit 0 (== CPU0) cleared, e.g. 0xfffffffe. Your critical application would then have the affinity mask 0x1, meaning that it is allowed to run only on CPU0.
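    To make the mask arithmetic concrete, here is a minimal sketch. It assumes a Linux system and Python's os.sched_setaffinity; the helper names affinity_masks, mask_to_cpus and pin_current_process are hypothetical, and the CPU count is passed in explicitly rather than detected.

```python
import os


def affinity_masks(cpu_count, reserved_cpu):
    """Return (housekeeping_mask, critical_mask) as integers."""
    if not 0 <= reserved_cpu < cpu_count:
        raise ValueError("reserved_cpu must be an existing CPU index")
    all_cpus = (1 << cpu_count) - 1
    critical = 1 << reserved_cpu
    # Everything except the reserved CPU is left for the rest of the system.
    return all_cpus & ~critical, critical


def mask_to_cpus(mask):
    """Expand a bit mask into the set of CPU indices it selects."""
    return {cpu for cpu in range(mask.bit_length()) if (mask >> cpu) & 1}


def pin_current_process(mask):
    """Restrict the calling process to the CPUs in mask (Linux only)."""
    os.sched_setaffinity(0, mask_to_cpus(mask))
```

    With 32 CPUs and CPU0 reserved, affinity_masks(32, 0) yields 0xfffffffe and 0x1, the two values mentioned above; on a 4-core board the housekeeping mask would be 0xe. The critical application could call pin_current_process(0x1) at startup instead of being launched through taskset.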

    Additionally, you might need to call the sched_setscheduler syscall in the application to select one of the real-time scheduling policies (SCHED_FIFO or SCHED_RR). That might improve the latencies of your application (but can also make them worse).
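    A rough sketch of that call, using Python's os wrapper around the syscall. The default priority of 50 and the choice of SCHED_FIFO are arbitrary assumptions for illustration, and clamp_priority and request_realtime are hypothetical helper names. The call normally fails without sufficient privileges, so the failure is reported instead of raised.

```python
import os


def clamp_priority(priority, lowest, highest):
    """Force priority into the range the policy accepts."""
    return max(lowest, min(highest, priority))


def request_realtime(priority=50, policy=None):
    """Try to move the calling process to a real-time policy.

    Returns True on success and False if the kernel refuses, for
    example because the process lacks the required privileges.
    """
    if policy is None:
        policy = os.SCHED_FIFO  # assumption: FIFO rather than round-robin
    priority = clamp_priority(
        priority,
        os.sched_get_priority_min(policy),
        os.sched_get_priority_max(policy),
    )
    try:
        os.sched_setscheduler(0, policy, os.sched_param(priority))
    except PermissionError:
        return False
    return True
```

    A real-time task that never blocks can starve everything else on its CPU, which is one way the change can make things worse, so it is worth testing this together with the affinity setup rather than in isolation.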

    Note that tuning CPU affinity is not a trivial endeavor, and clear-cut solutions are rare. You will need to test and experiment to make sure that the configuration can sustain the performance you need. For example, your application will likely communicate with other processes. If that communication is synchronous and the other processes are slow to react (because they have limited CPU resources), the performance of your critical application suffers in turn. The same applies to the interrupt(s) of the sound card.

    Hope that helps.