Difference between VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL and VK_IMAGE_LAYOUT_READ_ONLY_OPTIMAL


The documentation says:

VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL specifies a layout allowing read-only access in a shader as a sampled image, combined image/sampler, or input attachment. This layout is valid only for image subresources of images created with the VK_IMAGE_USAGE_SAMPLED_BIT or VK_IMAGE_USAGE_INPUT_ATTACHMENT_BIT usage bits enabled.

and:

VK_IMAGE_LAYOUT_READ_ONLY_OPTIMAL specifies a layout allowing read only access as an attachment, or in shaders as a sampled image, combined image/sampler, or input attachment.

The only difference I can see between the two is that one has "SHADER" in the name, and the descriptions don't make the distinction clear to me. Aren't all reads from the "shader"?


Solution

  • VK_IMAGE_LAYOUT_READ_ONLY_OPTIMAL was added much later (in Vulkan 1.3, promoted from VK_KHR_synchronization2).

    Originally, if you wanted to "read" from a texture, you had to decide whether you needed SHADER_READ_ONLY_OPTIMAL, TRANSFER_SRC_OPTIMAL, DEPTH_STENCIL_READ_ONLY_OPTIMAL, etc.

    In many cases the distinction is a no-op (i.e. it makes no difference on certain GPUs), yet you still had to issue vkCmdPipelineBarrier for pointless READ -> READ transitions, which can create pipeline bubbles or additional processing overhead.

    So Khronos released the more generic VK_IMAGE_LAYOUT_READ_ONLY_OPTIMAL. Keep in mind, though, that sometimes the distinction still matters.

    Depth buffers are a notable case: they use various compression schemes, and a depth buffer may need to be decompressed before it can be sampled, whereas no decompression is needed if you only use it as a read-only depth attachment.

    Also, some (not-so-)old GPUs cannot sample from DCC (delta color compression) color textures, so for color textures I'd still recommend the more specific layouts such as SHADER_READ_ONLY_OPTIMAL.

    Ultimately, it depends on your target hardware (mobile? desktop? GeForce RTX 2080 and newer? RDNA2 and newer?) and on whether it actually makes a measurable performance difference, so profile.