Huge asynchronous compute work stalls rendering in Vulkan

I have a usual rendering loop and some big compute dispatches running on a dedicated queue from a compute-only family. Both run on a single VkDevice but are otherwise completely independent. The compute work basically consists of a single dispatch, which can take up to a few seconds to complete. I see freezes of up to 200 ms in rendering while that compute work runs in the background, though if I reduce the compute workload so that it takes around 0.5 s, there are no lags at all. I use an Nvidia Quadro T2000 on Windows; priorities for rendering and compute queues are 1.0 and 0.0 respectively.

I guess splitting the compute work into separate smaller submits would help, but I am wondering what can cause such behavior. Or maybe which profiling tools could help me diagnose this?

Solution

The Vulkan spec:

There are no implicit ordering constraints between queue operations on different queues, or between queues and the host, so these may operate in any order with respect to each other. Explicit ordering constraints between different queues or with the host can be expressed with semaphores and fences.

priorities for rendering and compute queues are 1.0 and 0.0 respectively.

→

No specific guarantees are made about higher priority queues receiving more processing time or better quality of service than lower priority queues.

Additionally, if you let something run for a couple of seconds, it tends to trigger a watchdog on modern operating systems (on Windows this is Timeout Detection and Recovery, whose default timeout is about two seconds). You could check whether the hitch is caused by a driver restart.
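
The idea from the question, splitting the one long dispatch into several smaller submits, could look roughly like the sketch below. It keeps each piece of GPU work short, which may give the driver more chances to schedule graphics work in between and keeps each submit well under a watchdog timeout, though the spec quotes above mean none of this is guaranteed. The sketch assumes the dispatch is one-dimensional along X; `DispatchChunk` and `splitDispatch` are hypothetical names, not part of Vulkan. The Vulkan calls are only mentioned in comments so the block compiles without the Vulkan headers.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical helper type: one slice of a larger 1D dispatch.
struct DispatchChunk {
    std::uint32_t baseGroupX;   // first workgroup index covered by this chunk
    std::uint32_t groupCountX;  // number of workgroups in this chunk
};

// Splits totalGroupsX workgroups into chunks of at most maxGroupsPerChunk.
// Returns an empty list if either argument is zero.
std::vector<DispatchChunk> splitDispatch(std::uint32_t totalGroupsX,
                                         std::uint32_t maxGroupsPerChunk) {
    std::vector<DispatchChunk> chunks;
    if (totalGroupsX == 0 || maxGroupsPerChunk == 0) return chunks;
    for (std::uint32_t base = 0; base < totalGroupsX; ) {
        // The last chunk may be smaller than the others.
        std::uint32_t count = std::min(maxGroupsPerChunk, totalGroupsX - base);
        chunks.push_back({base, count});
        base += count;
    }
    return chunks;
}

// Intended use (not compiled here, needs the Vulkan headers):
//   for each chunk c, record into its own command buffer
//     vkCmdDispatchBase(cmd, c.baseGroupX, 0, 0, c.groupCountX, 1, 1);
//   and submit each command buffer with a separate vkQueueSubmit call.
```

`vkCmdDispatchBase` is core in Vulkan 1.1 and needs the compute pipeline to be created with `VK_PIPELINE_CREATE_DISPATCH_BASE` when the base is nonzero. If that is not an option, passing `baseGroupX` as a push constant and adding it in the shader is an alternative. The chunk size is something you would have to tune by measurement.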