QEMU: prevent a guest from hogging the CPU


I was reading some material about QEMU internals, and it mentions that:

"Jumping into guest code takes away our control of execution and gives control to the guest. While a thread is running guest code it cannot simultaneously be in the event loop because the guest has (safe) control of the CPU. Typically the amount of time spent in guest code is limited because reads and writes to emulated device registers and other exceptions cause us to leave the guest and give control back to QEMU. In extreme cases a guest can spend an unbounded amount of time without giving up control and this would make QEMU unresponsive. In order to solve the problem of guest code hogging QEMU's thread of control signals are used to break out of the guest. A UNIX signal yanks control away from the current flow of execution and invokes a signal handler function. This allows QEMU to take steps to leave guest code and return to its main loop where the event loop can get a chance to process pending events."

So it's not clear to me what generates the signals (QEMU's I/O thread or the kernel?) and how they help to break out of the thread executing guest code. If the kernel sends such a signal to the QEMU process, then I would assume that QEMU intentionally injects certain instructions (via binary translation) into the guest code, which result in exceptions and then signals. Is that right?


Solution

  • No, QEMU isn't injecting anything into the guest code. Typically a (host) signal is sent by another QEMU thread, such as the iothread, to the vCPU thread. This works because of how signals behave: when the host kernel delivers a signal to a thread, it stops that thread from doing whatever it was doing (i.e. running guest code) and makes it run the signal handler (which is QEMU process code) instead. None of this is related to anything the guest sees, such as a guest CPU exception, except in the very indirect sense that once QEMU has control again it might decide that it should now tell the guest about something by delivering an exception or interrupt (e.g. "I/O event completed, emulated SCSI controller has sent you an interrupt").

    For KVM, receiving a signal also means that the kernel will cause the vCPU thread to return from the KVM_RUN ioctl call (the ioctl fails with EINTR), and it will then re-enter the QEMU main loop.

    For TCG, I think that blog post is now a bit out of date (it is 10 years old, after all[*]). We don't need to send a signal to the vCPU thread to make it stop running guest code, because when we translate the guest code we include, at the beginning of each block of translated code, a fragment which says "if a flag is set, stop". So the iothread can stop the vCPU thread just by setting the flag.

    [*] More generally, don't trust the detail in that blog post to still be correct today. Most notably, QEMU now only supports the "iothread" model, and the "non-iothread" handling has been removed completely; and in many situations TCG can support multiple vCPU threads and need not run all vCPUs on a single host thread.