resource-leak, xperf

Diagnosing Cause of 100% CPU Usage by "System" Process


I have a Windows server application, implemented in C++ using the Win32 API, that does a lot of serial and TCP/IP communication. As it runs, CPU usage gradually increases until it reaches 100%. Task Manager shows that most (>75%) of the CPU usage is attributed to the "System" process. If I kill my server process, CPU usage returns to normal.

Are there any "easy" ways to diagnose exactly what the problem is?

I suspect that I/O connections are being opened and never closed, so the OS is spending more and more time servicing those requests, but I'd like to verify that is the case before I try to fix the problem.


Update: After playing around with xperf, I've found that the System process is spending more than half of its time in ntoskrnl.exe!KxWaitForSpinLockAndAcquire. I don't know anything about this, but the name of the function suggests to me that there might be a deadlock/contention issue.

Other functions that System is using a lot include NETIO.SYS!FilterMatchEnum, NETIO.SYS!MatchConditionOverlap, NETIO.SYS!IsFilterVisible, and MpNWMon.sys!NetFlowUpendByCompletionHandle.


Solution

  • I recommend checking out the Sysinternals tools if you haven't already.

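    One tool there I like very much is Handle, which lists the open handles (files and other kernel objects) held by each process on the system. Running it periodically against your server process should show whether the number of open handles keeps growing.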

    Another one that seems directly applicable to your scenario is ProcDump, which can write a dump of a process when its CPU usage exceeds a threshold you specify.