Tags: synchronization, gpu, vulkan, memory-barriers

Vulkan Synchronization: Avoiding a write-after-write hazard, why is this correct?


In the problem: "First render-pass writes to a depth attachment. Second render-pass re-uses the same depth attachment."

The official Vulkan wiki says:

This is an example of a WAW (Write-After-Write) hazard, which always require a memory dependency. Even if the render-pass does not read the output of the previous pass (in fact, in this example the previously image contents are explicitly not preserved by nature of transitioning from UNDEFINED) we still need a memory dependency to ensure writes to the image are not re-ordered.

It also provides an example, using a subpass dependency:

.srcStageMask = VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT,  // Store op is always performed in late tests, after subpass access
.dstStageMask = VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT, // Load op is always performed in early tests, before subpass access
.srcAccessMask = VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT,
.dstAccessMask = VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT | VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT

The Vulkan tutorial (depth buffer chapter), however, provides a seemingly different solution to that problem:

.srcStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT | VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT,
.srcAccessMask = 0,
.dstStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT | VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT,
.dstAccessMask = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT | VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT

In this case, if we ignore the color attachment stages and access bits, this solution seems to provide only an execution dependency, with no memory dependency, for operations on the depth attachment.

I'm not sure I understand this correctly (that the two cases are the same with respect to the depth attachment, and that the second solution only provides an execution dependency for operations on it). I would appreciate it if someone could clarify why the second solution is right (or whether the two are essentially the same).

If my understanding is correct, the solution in the Vulkan tutorial cannot provide a memory barrier between the memory writes of the two render passes, so why is it an acceptable solution?


Solution

  • The synchronization situation is as follows:

    [Figure: Synchronization of depth between two render passes]

    Now, the Vulkan specification is quite explicit:

    • The Load Op for a depth attachment happens in VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT, with VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT in the case of VK_ATTACHMENT_LOAD_OP_LOAD, and with VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT in the case of CLEAR or DONT_CARE.
    • The Store Op for a depth attachment happens in VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT with VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT.
    • A layout transition always happens somewhere between the dependency's srcStageMask and dstStageMask.
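
    As a sketch of the first rule above, the access type used by the depth Load Op can be derived from the attachment's loadOp. The helper name `depth_load_op_access` is hypothetical (it is not part of Vulkan or the wiki), and the snippet assumes the Vulkan headers are installed:

    ```c
    #include <vulkan/vulkan.h>

    /* Hypothetical helper: access type used by the depth attachment Load Op,
       which executes in VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT. */
    static VkAccessFlags depth_load_op_access(VkAttachmentLoadOp load_op)
    {
        switch (load_op) {
        case VK_ATTACHMENT_LOAD_OP_LOAD:
            /* existing contents are read */
            return VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT;
        case VK_ATTACHMENT_LOAD_OP_CLEAR:
        case VK_ATTACHMENT_LOAD_OP_DONT_CARE:
            /* contents are overwritten or left undefined: a write access */
            return VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT;
        default:
            return 0; /* other load ops are outside the scope of this sketch */
        }
    }
    ```

    For a CLEAR load op, as in the wiki example, this yields only the WRITE access flag for dstAccessMask.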

    The situation in the Wiki example is that they do a layout transition, then CLEAR load op, and then no final layout transition. And the dependency handling the synchronization is in the second render pass. So it is:

    VkSubpassDependency dependency = {
      .srcSubpass = VK_SUBPASS_EXTERNAL, // previous render passes
      .dstSubpass = 0, // this render pass
      .srcStageMask = VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT,  // Store op of previous render pass
      // layout transition happens here 
      .dstStageMask = VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT, // Load op of this render pass
      .srcAccessMask = VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT, // Store Op access of previous render pass
      // the Clear Load Op (`READ` flag is redundant as per https://github.com/KhronosGroup/Vulkan-Docs/issues/2055)
      .dstAccessMask = VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT | VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_READ_BIT,
      .dependencyFlags = 0
    };
    

    Satisfactorily explaining the choices made in a tutorial is the responsibility of the tutorial's author (especially when they differ from the official materials). If something in learning materials is unclear, contact their author (rather than the general community) so it can be clarified at the source instead of spreading the confusion elsewhere.

    At first sight, the tutorial's srcStageMask seems incorrect. It should use VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT, matching the Store Op stage.

    srcAccessMask should (conservatively) be included as well (as VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT, matching the access of a DONT_CARE store op), although there is some talk of eliminating access flags in write-after-write situations.
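
    Putting these two remarks together, a conservative variant of the tutorial's dependency might look like the sketch below. This is one reading of the points above, not the tutorial author's code: the variable name `conservative_dependency` is hypothetical, srcSubpass and dstSubpass are assumed to be VK_SUBPASS_EXTERNAL and 0 (the question does not show them), and the color attachment bits are kept as the question quotes them.

    ```c
    #include <vulkan/vulkan.h>

    /* Sketch: the tutorial's dependency with the depth-related source scope
       changed to match the Store Op of the previous render pass. */
    static const VkSubpassDependency conservative_dependency = {
        .srcSubpass = VK_SUBPASS_EXTERNAL, /* assumed: previous render passes */
        .dstSubpass = 0,                   /* assumed: first subpass of this render pass */
        /* LATE instead of EARLY: the Store Op stage of the previous render pass */
        .srcStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT |
                        VK_PIPELINE_STAGE_LATE_FRAGMENT_TESTS_BIT,
        /* EARLY: the Load Op stage of this render pass */
        .dstStageMask = VK_PIPELINE_STAGE_COLOR_ATTACHMENT_OUTPUT_BIT |
                        VK_PIPELINE_STAGE_EARLY_FRAGMENT_TESTS_BIT,
        /* conservatively include the Store Op write access */
        .srcAccessMask = VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT,
        .dstAccessMask = VK_ACCESS_COLOR_ATTACHMENT_WRITE_BIT |
                         VK_ACCESS_DEPTH_STENCIL_ATTACHMENT_WRITE_BIT,
        .dependencyFlags = 0
    };
    ```

    With a non-zero srcAccessMask, the dependency is a memory dependency for the depth writes as well as an execution dependency, which is what the wiki example asks for in the write-after-write case.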