I am using image block data to store semi-transparent fragments in a 3-layer stack, like this:
#define TRANSPARENCY_LAYERS 3
struct TransparentFragmentValues {
    rgba8unorm<float4> color [[raster_order_group(0)]] [TRANSPARENCY_LAYERS];
    float depth [[raster_order_group(0)]] [TRANSPARENCY_LAYERS];
};
These transparent fragments are blended with the rest of my rendering in the final stage. This works fine, except that when I enable multisample antialiasing (4 samples), artefacts appear all over the screen.
My tiles are 16x16, and each fragment takes exactly 64 bytes (4 for the opaque color + 4 for the depth + 3 x (4+4) for the transparent data), for a total of 16kB without MSAA, and 64kB with MSAA-4, which is the memory limit on my GPU.
I suspect that it has to do with how I initialise my imageblock data:
kernel void transparent_fragment_store_init (
    imageblock<TransparentFragmentValues, imageblock_layout_explicit> blockData,
    ushort2 localThreadID [[thread_position_in_threadgroup]])
{
    threadgroup_imageblock TransparentFragmentValues* fragmentValues = blockData.data(localThreadID);
    for (short i = 0; i < TRANSPARENCY_LAYERS; i++) {
        fragmentValues->color[i] = 0.0;
        fragmentValues->depth[i] = INFINITY;
    }
}
This function doesn't take into account that, with 4x MSAA, each thread has to initialise 4 TransparentFragmentValues (one per sample) instead of just one. But how can I index them? I can't just add more threads: it fails if the threadgroup is not 16x16.
After reading the metal_imageblocks header, I think you should not use the T *data(ushort2 coord) variant of the function, but the T *data(ushort2 coord, ushort index, imageblock_data_rate rate) variant instead, passing imageblock_data_rate::sample as the rate. The index argument is then your sample index.
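As a rough, untested sketch of that suggestion, the init kernel from the question could loop over the samples of each pixel and initialise every one of them. It assumes that get_num_samples() is available on the imageblock (that is how I read the header) and that it returns 4 with MSAA-4; if it is not available, the sample count can be hard-coded or passed in instead.

```metal
#include <metal_stdlib>
using namespace metal;

#define TRANSPARENCY_LAYERS 3

// Same layout as in the question.
struct TransparentFragmentValues {
    rgba8unorm<float4> color [[raster_order_group(0)]] [TRANSPARENCY_LAYERS];
    float depth [[raster_order_group(0)]] [TRANSPARENCY_LAYERS];
};

kernel void transparent_fragment_store_init (
    imageblock<TransparentFragmentValues, imageblock_layout_explicit> blockData,
    ushort2 localThreadID [[thread_position_in_threadgroup]])
{
    // Assumption: get_num_samples() reports the per-pixel sample count
    // (1 without MSAA, 4 with MSAA-4). Replace with a constant if needed.
    const ushort sampleCount = blockData.get_num_samples();

    // Still one thread per pixel (16x16), but each thread now visits
    // every sample slot of its pixel.
    for (ushort s = 0; s < sampleCount; s++) {
        threadgroup_imageblock TransparentFragmentValues* fragmentValues =
            blockData.data(localThreadID, s, imageblock_data_rate::sample);

        for (short i = 0; i < TRANSPARENCY_LAYERS; i++) {
            fragmentValues->color[i] = 0.0;
            fragmentValues->depth[i] = INFINITY;
        }
    }
}
```

This keeps the 16x16 dispatch unchanged, so no extra threads are needed. Without MSAA the loop runs once, which should behave the same as the original kernel.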