Tags: c#, performance, lag, slimdx, dxgi

Swapchain.Present() taking far too long, causing lag


I've recently been getting a bit of lag since I moved all of my C# SlimDX DX11 rendering code from my Form (yes, I'm a lazy developer) to bespoke classes. I ran my program through EQATEC Profiler and got this as the major contributor to my lag:

[Image: EQATEC Profiler results showing function postRender() to be at fault]

Now it's clear here that whatever's in postRender() is really hogging the precious milliseconds. In fact, whatever crazy, convoluted code I have in there is effectively reducing my frame rate to ~15 FPS on its own.

So what's in postRender()? Just one line of code:

swapChain.Present(0, PresentFlags.None);

I just have no idea what's caused it to slow down so much; I've not made any changes to the swapchain code at all. All I've altered is the screen resolution (1680x1050), but that should be absolutely fine (for reference, this machine can run Crysis 2 at maximum settings at that resolution without breaking a sweat).

Does anybody have any idea what might cause a swapchain to take so long on presenting or where I should look next for problems?

EDIT:

Looking at the structure of my code, my RenderFrame() function is as follows:

preRender();
DeferredRender(preShader);
//Composite scene to output image
CompositeScene(compositeShader);
//Post Process
PostProcess(postProcShader);
//Depth of Field
DoF(dofShader);
//Present the swapchain
postRender();

The results of some of these functions depend on the functions before them. For example, DeferredRender uses four render targets to capture Diffuse lighting, Normals, Positions and Color in a per-pixel manner, and CompositeScene then puts them all together. This would require the GPU to have computed the previous step before it can continue. This whole process continues along, with DoF requiring the results of PostProcess, etc. Therefore the only shader that could possibly be holding Swapchain.Present() up must be the shader which runs in the function DoF, as all the other shaders cause the CPU to lock until they're finished. Correct?


Solution

  • There are a few reasons why you might find Present() taking up that much time in your frame. The Present call is the primary point of synchronization between the CPU and the GPU; if the CPU is generating frames much faster than the GPU can process them, they'll get queued up. Once that queue of pending frames is full, Present() turns into a glorified Sleep() call while it waits for the queue to drain.

    Of course, it's pretty much impossible to say with the little information that you've given here. Just because a machine runs Crysis doesn't mean you can get away with throwing anything you want at the card. Double check that you're not expecting to render crazy amounts of geometry and that your shaders aren't abnormally long and complex.

    Also take a look at your app using one of the available GPU profilers; PIX is a good starting point, while NVIDIA and AMD each have their own more specific offerings for their own products. Finally, make sure your drivers are up to date. If there's a bug in the driver, any chance you have at reasoning about the issue goes out the window.
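
To make the queueing behaviour described above a bit more concrete, here is a minimal sketch in plain C#. It does not use SlimDX or DXGI at all: a bounded BlockingCollection stands in for the driver's queue of pending frames, a background task stands in for the GPU, and Add() plays the role of Present(). The class name PresentQueueSketch, the Run() helper, and the queue depth of 3 are all assumptions made for illustration; the real queue depth depends on the driver and settings.

```csharp
using System;
using System.Collections.Concurrent;
using System.Diagnostics;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for the CPU/GPU frame queue; this is NOT SlimDX or DXGI code.
public static class PresentQueueSketch
{
    // Returns the total milliseconds the "CPU" spent blocked in the simulated Present().
    public static long Run(int frames, int gpuMsPerFrame, int queueDepth)
    {
        using (var queue = new BlockingCollection<int>(queueDepth))
        {
            // "GPU": drains one frame at a time, each costing gpuMsPerFrame.
            Task gpu = Task.Run(() =>
            {
                foreach (int frame in queue.GetConsumingEnumerable())
                {
                    Thread.Sleep(gpuMsPerFrame);
                }
            });

            var blocked = new Stopwatch();
            for (int i = 0; i < frames; i++)
            {
                // CPU-side work per frame is assumed to be negligible here.
                blocked.Start();
                queue.Add(i); // simulated Present(): blocks once the queue is full
                blocked.Stop();
            }

            queue.CompleteAdding();
            gpu.Wait();
            return blocked.ElapsedMilliseconds;
        }
    }

    public static void Main()
    {
        Console.WriteLine("blocked with slow GPU: " + Run(20, 20, 3) + " ms");
        Console.WriteLine("blocked with fast GPU: " + Run(20, 0, 3) + " ms");
    }
}
```

With the slow consumer, nearly all of the wall-clock time should show up inside the simulated Present(), even though that call does no work itself. That mirrors the profiler result in the question: the time is attributed to Present(), but the cost lies in the GPU work that was submitted earlier in the frame.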