
SYNC-HAZARD-WRITE-AFTER-READ error with swapchain acquisition and command buffer submission


I wrote a minimal reproducible example that triggers a Vulkan validation error, apparently related to synchronization between swapchain image acquisition and command buffer submission (the full code is available on Gist).

The example C code should just create a black window with GLFW and return immediately after a single frame.

The relevant parts are below (I removed the fences to simplify the code; the error is the same with or without them):

    VkSemaphore imageAvailableSemaphore = createSemaphore(device);
    VkSemaphore renderFinishedSemaphore = createSemaphore(device);

    uint32_t imageIndex;
    vkAcquireNextImageKHR(
        device, swapChain, UINT64_MAX, imageAvailableSemaphore, VK_NULL_HANDLE, &imageIndex);

    VkSemaphore waitSemaphores[] = {imageAvailableSemaphore};
    VkSemaphore signalSemaphores[] = {renderFinishedSemaphore};
    VkPipelineStageFlags waitStages[] = {VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT};

    VkSubmitInfo submitInfo = {0};  /* zero-initialize; empty braces are not standard C before C23 */
    submitInfo.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO;
    submitInfo.waitSemaphoreCount = 1;
    submitInfo.pWaitSemaphores = waitSemaphores;
    submitInfo.pWaitDstStageMask = waitStages;
    submitInfo.signalSemaphoreCount = 1;
    submitInfo.pSignalSemaphores = signalSemaphores;
    submitInfo.commandBufferCount = 1;
    submitInfo.pCommandBuffers = &commandBuffers[imageIndex];

    VkQueue graphicsQueue;
    vkGetDeviceQueue(device, queueFamilyIndex, 0, &graphicsQueue);
    vkQueueSubmit(graphicsQueue, 1, &submitInfo, VK_NULL_HANDLE);  /* no fence in this simplified version */

The command buffer is just:

        vkBeginCommandBuffer(commandBuffers[i], &beginInfo);
        vkCmdBeginRenderPass(commandBuffers[i], &renderPassInfo, VK_SUBPASS_CONTENTS_INLINE);
        vkCmdEndRenderPass(commandBuffers[i]);
        vkEndCommandBuffer(commandBuffers[i]);

The following SYNC-HAZARD-WRITE-AFTER-READ validation error is raised on queue submission:

    validation layer: Validation Error: [ SYNC-HAZARD-WRITE-AFTER-READ ] Object 0: handle =
    0x5599afbbf520, type = VK_OBJECT_TYPE_QUEUE; | MessageID = 0x376bc9df | vkQueueSubmit(): Hazard
    WRITE_AFTER_READ for entry 0, VkCommandBuffer 0x5599b202e960[], Submitted access info
    (submitted_usage: SYNC_IMAGE_LAYOUT_TRANSITION, command: vkCmdBeginRenderPass, seq_no: 1,
    renderpass: VkRenderPass 0xcad092000000000d[], reset_no: 1). Access info (prior_usage:
    SYNC_PRESENT_ENGINE_SYNCVAL_PRESENT_ACQUIRE_READ_SYNCVAL, read_barriers:
    VK_PIPELINE_STAGE_2_COLOR_ATTACHMENT_OUTPUT_BIT|VK_PIPELINE_STAGE_2_BOTTOM_OF_PIPE_BIT, ,
    batch_tag: 1, vkAcquireNextImageKHR aquire_tag:1: VkSwapchainKHR 0xf443490000000006[],
    image_index: 0image: VkImage 0xcb3ee80000000007[]).

I only see this error on one specific configuration (Linux with the latest LunarG SDK, v1.3.275). I don't know whether the problem is in my code or in the validation layers.

I should add that the error disappears if I replace VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT with VK_PIPELINE_STAGE_ALL_COMMANDS_BIT in VkSubmitInfo.pWaitDstStageMask (but then I get a performance warning instead).

What am I doing wrong?

My configuration:

Ubuntu 22.04
LunarG SDK version 1.3.275
Vulkan Instance Version: 1.3.275
NVIDIA GeForce RTX 2070 SUPER
NVIDIA-SMI 535.129.03
Driver Version: 535.129.03
CUDA Version: 12.2

Solution

  • I think this discussion in the Vulkan-ValidationLayers GitHub repository explains the issue and how to fix it.

    The important part of the discussion is:

    There should be additional execution dependency with COLOR_ATTACHMENT_OUTPUT stage, otherwise image transition can start before image acquire operation finished. The destination stage is set to COLOR_ATTACHMENT_OUTPUT and this guarantees that writes from rasterization will wait. But the image layout transition happens earlier, and by setting source stage to COLOR_ATTACHMENT_OUTPUT we create execution dependency with semaphore wait operation (we chain with pWaitDstStageMask which also specifies COLOR_ATTACHMENT_OUTPUT).

    I reproduced your problem and added a barrier that creates the dependency described above; the validation message no longer appears. Your situation may not be exactly the same as the one in that discussion, but it seems likely to be.