Tags: c++, performance, cpu, cpu-usage, framebuffer

Code execution/CPU speed slows down every 2 seconds


(This is something of a crossover software/hardware topic. It started as a programming problem for me, but after all the troubleshooting I think it's probably a hardware problem, so it may be better suited to Super User. I haven't solved it yet, so I don't know for sure, and hopefully this community will have some relevant CPU theory to share. Anyway...)

I'm writing a real-time rendering program and have been plagued by a visible framerate hitch that happens consistently every 2 seconds. After much profiling I've determined that this is a performance slowdown that affects all sections of code in my program (including graphics API calls), so I think it's a CPU issue that is outside the responsibility of the program.

I can demonstrate the problem on my machine with the following code in a fresh Code::Blocks project:

#include <cstdint>
#include <iostream>
#include <chrono>

int main(int argc, char* args[])
{
    std::cout << std::fixed;

    // Use the time_point type that belongs to the clock being read, so this
    // compiles even where high_resolution_clock is not an alias of system_clock.
    std::chrono::high_resolution_clock::time_point runStart = std::chrono::high_resolution_clock::now();

    while(true)
    {
        uint64_t count = 0;

        // Time one "frame" of fixed busy work.
        std::chrono::high_resolution_clock::time_point frameStart = std::chrono::high_resolution_clock::now();
        {
            for(uint64_t i = 0; i < 100000; ++i)
                ++count;
        }
        std::chrono::high_resolution_clock::time_point frameStop = std::chrono::high_resolution_clock::now();

        // Seconds since program start, and seconds spent in this frame.
        double runTime = std::chrono::duration<double, std::chrono::seconds::period>(frameStop - runStart).count();
        double frameTime = std::chrono::duration<double, std::chrono::seconds::period>(frameStop - frameStart).count();

        // Only report frames slower than the threshold.
        if(frameTime > 0.0005)
            std::cout << count << " runTime: " << runTime << " \tframeTime: " << frameTime << '\n';
    }

    return 0;
}

The typical output looks something like this, which clearly shows a handful of slower frames every 2 seconds:

100000 runTime: 0.000393    frameTime: 0.000393
100000 runTime: 0.000840    frameTime: 0.000393
100000 runTime: 0.001214    frameTime: 0.000369
100000 runTime: 0.002984    frameTime: 0.000389
100000 runTime: 0.003384    frameTime: 0.000395
100000 runTime: 0.003781    frameTime: 0.000393
100000 runTime: 0.004158    frameTime: 0.000371
100000 runTime: 0.005927    frameTime: 0.000386
100000 runTime: 0.006329    frameTime: 0.000398
100000 runTime: 0.006724    frameTime: 0.000390
100000 runTime: 0.007127    frameTime: 0.000398
100000 runTime: 0.007507    frameTime: 0.000375
100000 runTime: 0.994469    frameTime: 0.000511
100000 runTime: 3.042060    frameTime: 0.000465
100000 runTime: 3.077671    frameTime: 0.000405
100000 runTime: 5.093173    frameTime: 0.000496
100000 runTime: 5.128435    frameTime: 0.000366
100000 runTime: 5.488874    frameTime: 0.000391
100000 runTime: 7.135737    frameTime: 0.000367
100000 runTime: 7.152022    frameTime: 0.000484
100000 runTime: 7.457491    frameTime: 0.000360
100000 runTime: 9.179262    frameTime: 0.000478
100000 runTime: 9.211521    frameTime: 0.000368
100000 runTime: 9.226528    frameTime: 0.000353
100000 runTime: 11.217430   frameTime: 0.000391
100000 runTime: 11.262574   frameTime: 0.000352
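
The spacing of the hitches can be checked from the logged runTime values rather than by eye. The following is only a sketch of one way to do that: it groups slow frames into bursts and averages the time between burst starts. The function names and the maxGapInBurst parameter are made up for this illustration and are not part of the original program.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Collapse slow-frame timestamps (seconds, ascending) into bursts: a new burst
// starts whenever the gap to the previous slow frame exceeds maxGapInBurst.
// Returns the start time of each burst. (Hypothetical helper, for illustration.)
std::vector<double> burstStarts(const std::vector<double>& slowFrameTimes,
                                double maxGapInBurst)
{
    assert(maxGapInBurst >= 0.0);
    std::vector<double> starts;
    for (std::size_t i = 0; i < slowFrameTimes.size(); ++i)
    {
        if (i == 0 || slowFrameTimes[i] - slowFrameTimes[i - 1] > maxGapInBurst)
            starts.push_back(slowFrameTimes[i]);
    }
    return starts;
}

// Mean interval between consecutive burst starts; 0.0 if fewer than two bursts.
double meanBurstInterval(const std::vector<double>& starts)
{
    if (starts.size() < 2)
        return 0.0;
    return (starts.back() - starts.front())
           / static_cast<double>(starts.size() - 1);
}
```

Feeding it the fourteen runTime values from 0.994469 onward with maxGapInBurst set to 0.5 gives six bursts with a mean spacing of about 2.04 seconds, which matches the "every 2 seconds" impression.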

Maybe some other machines will show similar output? (Adjust the output threshold as necessary.)

I've tried compiling with both g++ and clang, and both produce this anomaly. (The g++ version performs a little better overall.)
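
One caveat when comparing compilers: with optimization enabled, either compiler may fold the plain counting loop into a constant, which would leave nothing to time. A minimal sketch of a workaround, assuming the goal is just a fixed amount of CPU work per frame, is to make the counter volatile. The function name busyWork is made up for this example.

```cpp
#include <cstdint>

// Busy-work loop whose counter is volatile, so the compiler must perform
// every read and write even in an optimized build instead of replacing the
// loop with its final value. (Hypothetical helper, for illustration.)
std::uint64_t busyWork(std::uint64_t iterations)
{
    volatile std::uint64_t count = 0;
    for (std::uint64_t i = 0; i < iterations; ++i)
        count = count + 1;   // explicit read-modify-write on the volatile
    return count;
}
```

Calling busyWork(100000) between the two clock reads would stand in for the inner loop of the test program.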

It just occurred to me that since I built this computer I haven't run any 3D apps on it other than my own project. So I tried running the Hologram demo that comes with the LunarG Vulkan SDK, and some screensavers, and sure enough there's that hitch every 2 seconds. (It's less noticeable in the screensavers.) So I'm relieved to know that it's at least system-wide, and not something I'm doing wrong in my programs.

System specs:

  • cpu: AMD Ryzen 7 1800X
  • motherboard: MSI B350 Tomahawk Arctic
  • ram: 1x16GB DDR4 3200
  • gpu: GeForce GTX 1050 Ti
  • psu: Cougar CMX 1000
  • os: Linux Mint 18.1 64-bit

It looks like I went overboard with the 1000 W PSU, so lack of wattage is not the problem. Unless perhaps there's some defect in the Cougar PSU that causes a brief drop every 2 seconds?

Any ideas what could be causing this and what I can do about it?

EDIT: More details about my rendering problem:

I've been building my engine around OpenGL for a long time and have had this problem for as long as I've been working on this machine. Before that I had similar random frame skips on my old computer, which had its own issues (an old gaming laptop with a 20-minute battery life and a propensity to overheat enough to trigger the safety shut-off), so I didn't immediately suspect a hardware issue. I had been planning to port my project to Vulkan, so I put off worrying about the frame skip, hoping it would go away when I changed the rendering API. But now I am going through this Vulkan tutorial, and I already see the frame skip when drawing nothing more than the tutorial's spinning quad.

I've been trying to hack my way around the Vulkan swap chain to optimize around this. The best solution I have is a thread dedicated only to calling vkQueuePresentKHR exactly every 1/60 of a second (using std::this_thread::sleep_until to wait between calls), with the Vulkan present mode set to VK_PRESENT_MODE_IMMEDIATE_KHR, and another thread drawing to up to 7 other swap chain images in advance so the swap thread doesn't have to wait for them. (I know this probably isn't the safest way to do it according to the Vulkan spec.) With this setup, the call to vkQueuePresentKHR usually takes between 0.000050 and 0.000300 seconds to return, but every 2 seconds there's at least 1 frame where it takes > 0.01 seconds. That might make sense if I were using VK_PRESENT_MODE_FIFO_KHR and the call just barely missed the vblank and had to wait for the next one. But I'm using VK_PRESENT_MODE_IMMEDIATE_KHR, so I don't know what's happening, except that whatever under-the-hood code gets invoked inside vkQueuePresentKHR is being seriously impacted by my performance anomaly.
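
The timing side of that present thread can be sketched without any Vulkan code. This is only an illustration of the pacing and measurement idea, not the actual engine code: present is a hypothetical stand-in for whatever wraps vkQueuePresentKHR, and the function name and parameters are made up for this example.

```cpp
#include <chrono>
#include <functional>
#include <thread>
#include <vector>

// Calls `present` once per `period` on an absolute schedule and returns the
// indices of the calls that took longer than `slowThreshold` to return.
// `present` is a hypothetical stand-in for the real vkQueuePresentKHR wrapper.
std::vector<int> pacePresents(const std::function<void()>& present,
                              int frameCount,
                              std::chrono::steady_clock::duration period,
                              std::chrono::steady_clock::duration slowThreshold)
{
    using clock = std::chrono::steady_clock;
    std::vector<int> slowFrames;
    clock::time_point next = clock::now();
    for (int frame = 0; frame < frameCount; ++frame)
    {
        // Measure how long this one call takes to return.
        const clock::time_point begin = clock::now();
        present();
        if (clock::now() - begin > slowThreshold)
            slowFrames.push_back(frame);

        // Advance the deadline from the previous deadline, not from now(),
        // so one late frame does not shift every later frame.
        next += period;
        std::this_thread::sleep_until(next);
    }
    return slowFrames;
}
```

Using steady_clock here is a deliberate choice for this sketch: it never jumps backwards, so both the sleep_until deadlines and the measured call durations stay meaningful even if the system clock is adjusted.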


Solution

  • The cause of the performance problem was a panel applet that monitors GPU temperature.