texture-mapping vertex-shader pixel-shader directx-12

Weird texture glitch in DirectX 12 when using dynamic indexing


I recently implemented texture loading in my engine, but the textures sometimes show small glitches.

The problem is shown in the following picture: [image]

I tested this on an AMD R9 380. I also ran the program on Intel HD Graphics 4600, but it drew nothing at all (no geometry was shown).

My shader code:

struct MaterialData
{
    float4 gDiffuseAlbedo;
    float3 gFresnelR0;
    float  gRoughness;
    float4x4 gMatTransform;
    uint gDiffuseMapIndex;
    uint MaterialPad0;
    uint MaterialPad1;
    uint MaterialPad2;
};
StructuredBuffer<MaterialData> gMaterialData : register(t1, space1);

// texture array for dynamic indexing
// gMaxNumOfTextures is currently set to 16
Texture2D gCommonTexture[gMaxNumOfTextures] : register(t3);

// in my pixel shader:
float4 diffuseAlbedoMap = gCommonTexture[matData.gDiffuseMapIndex].Sample(gsamAnisotropicWrap, pin.TexC) * matData.gDiffuseAlbedo;

I have already double-checked matData.gDiffuseMapIndex, and it is correct.

I tried everything to fix this, but in vain, until I changed my code to this:

// in my pixel shader:
float4 diffuseAlbedoMap = float4(0,0,0,0);

uint realMatIndex;
for(realMatIndex = 0;realMatIndex < gMaxNumOfTextures;realMatIndex++)
    if(realMatIndex == matData.gDiffuseMapIndex)
    {
        diffuseAlbedoMap = gCommonTexture[matData.gDiffuseMapIndex].Sample(gsamAnisotropicWrap, pin.TexC) * matData.gDiffuseAlbedo;
        break;
    }

This works properly!

No glitches on my R9 380.

And all geometry is shown correctly on the HD 4600.

BUT WHY!?

Why do I need a for loop that checks the index again to prevent the glitches?

(If I use more textures, this may not be efficient.)

What can cause this problem?

I thought about this problem all night but couldn't find the answer.

Thanks!

Full shader code: [link]


Solution

  • You need to use NonUniformResourceIndex(matData.gDiffuseMapIndex) to tell the compiler that the index is non-uniform.

    This is because on GCN hardware, the texture descriptor is stored in scalar registers shared across the wave. The intrinsic does what you did, a loop, but a little more efficiently: it masks threads per unique index found until all threads are done.

    This works, of course, but if the non-uniform cases multiply, you will end up with a very suboptimal shader. To improve on that, the only way is to not use the intrinsic and instead guarantee that the index is uniform across the wave, for example by splitting different materials into separate draw calls inside an ExecuteIndirect.
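
As a rough illustration of the answer, here is how the sampling code from the question might look. The HLSL intrinsic is spelled `NonUniformResourceIndex`. The helper function names, the sampler register, and the `#define` are assumptions made for this sketch and are not taken from the original shader. The second function only approximates the "loop per unique index" behaviour described above, using Shader Model 6.0 wave intrinsics; it is not what any particular driver actually emits.

```hlsl
// Sketch only. Assumed for illustration: the #define value, the sampler
// register (s0), and both helper function names.
#define gMaxNumOfTextures 16

Texture2D    gCommonTexture[gMaxNumOfTextures] : register(t3);
SamplerState gsamAnisotropicWrap               : register(s0);

// Option 1: tell the compiler the index may differ between threads of a wave.
// The compiler/driver can then generate whatever per-index handling the
// hardware needs.
float4 SampleDiffuse(uint diffuseMapIndex, float2 texC)
{
    return gCommonTexture[NonUniformResourceIndex(diffuseMapIndex)]
        .Sample(gsamAnisotropicWrap, texC);
}

// Option 2 (conceptual, requires Shader Model 6.0+): a hand-written version
// of the per-unique-index loop. Each iteration picks the index of the first
// still-active lane, and only lanes sharing that index sample and finish.
float4 SampleDiffuseWaterfall(uint diffuseMapIndex, float2 texC)
{
    // Compute gradients before the divergent loop, because implicit
    // derivatives are not reliable inside non-uniform control flow.
    float2 dx = ddx(texC);
    float2 dy = ddy(texC);

    float4 result = float4(0, 0, 0, 0);
    bool done = false;
    while (!done)
    {
        // Same value for every active lane, so the index is wave-uniform here.
        uint uniformIndex = WaveReadLaneFirst(diffuseMapIndex);
        if (uniformIndex == diffuseMapIndex)
        {
            result = gCommonTexture[uniformIndex]
                .SampleGrad(gsamAnisotropicWrap, texC, dx, dy);
            done = true;
        }
    }
    return result;
}
```

In the pixel shader from the question, Option 1 would be called as `SampleDiffuse(matData.gDiffuseMapIndex, pin.TexC) * matData.gDiffuseAlbedo`. Option 2 is mainly useful for seeing why the cost grows with the number of distinct indices in a wave: the loop runs once per unique index.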