
Can you use the built-in derivative functions in compute shaders? (Vulkan)


I want to use the built-in derivative functions:

    vec3 dpdx = dFdx(p);
    vec3 dpdy = dFdy(p);

inside a compute shader. However, I get the following error:

Message ID name: UNASSIGNED-CoreValidation-Shader-InconsistentSpirv
Message: Validation Error: [ UNASSIGNED-CoreValidation-Shader-InconsistentSpirv ] Object 0: handle = 0x5654380d4dd8, name = Logical device: GeForce GT 1030, type = VK_OBJECT_TYPE_DEVICE; | MessageID = 0x6bbb14 | SPIR-V module not valid: OpEntryPoint Entry Point <id> '5[%main]'s callgraph contains function <id> 46[%BiplanarMapping_s21_vf3_vf3_f1_], which cannot be used with the current execution modes:
Derivative instructions require DerivativeGroupQuadsNV or DerivativeGroupLinearNV execution mode for GLCompute execution model: DPdx
Derivative instructions require DerivativeGroupQuadsNV or DerivativeGroupLinearNV execution mode for GLCompute execution model: DPdy

  %BiplanarMapping_s21_vf3_vf3_f1_ = OpFunction %v4float None %41

Severity: VK_DEBUG_UTILS_MESSAGE_SEVERITY_ERROR_BIT_EXT

I can't find anything on the topic when I search online.


Solution

  • In core GLSL, derivative functions only work in a fragment shader. The derivatives are based on the rate of change of a value across the primitive being rendered. Compute shaders don't render primitives, so there is no implicit neighborhood to differentiate across.

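    NVIDIA has an extension (GL_NV_compute_shader_derivatives in GLSL, VK_NV_compute_shader_derivatives in Vulkan) that provides derivative computation for compute shaders. That's where the DerivativeGroupQuadsNV and DerivativeGroupLinearNV names in the error come from.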

    Derivatives in fragment shaders are computed by taking the difference of the same value between adjacent invocations. As such, you can emulate this by using shared variables.

    First, you have to make sure that the spatially adjacent invocations are in the same work group. So your work group size needs to be some multiple of 2x2 invocations. Then, you need a shared variable array, which you index by invocations within a work group. Each invocation should write its own value to its own index.

    To compute the derivative, synchronize the work group after writing the values to the shared variables: call memoryBarrierShared() followed by barrier(), since memoryBarrierShared() on its own orders memory accesses but does not make invocations wait for each other. Then take the difference between two adjacent invocations in the same 2x2 quad. To make sure all invocations in the same quad get the same value, always subtract the value at the lower index from the value at the higher index within the quad. Something like this:

    // Lowest invocation index in this 2x2 quad (dividing by 2 alone would give the quad's number, not its base).
    uvec2 quadBase = (gl_LocalInvocationID.xy / 2u) * 2u;
    /*type*/ derFdX = variable[quadBase.x + 1u][quadBase.y + 0u] - variable[quadBase.x + 0u][quadBase.y + 0u];
    /*type*/ derFdY = variable[quadBase.x + 0u][quadBase.y + 1u] - variable[quadBase.x + 0u][quadBase.y + 0u];
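
    Put together, the steps above might look like the following compute shader. This is only a sketch under a few assumptions that are not in the question: an 8x8 work group, a placeholder computeP() standing in for whatever value you want to differentiate, and an output buffer named DerivativeBuffer that holds one entry per invocation.

    ```glsl
    #version 450

    // Assumed work group size; x and y must both be multiples of 2
    // so that every 2x2 quad lies inside a single work group.
    layout(local_size_x = 8, local_size_y = 8) in;

    // Hypothetical output layout: one pair of derivatives per invocation.
    // vec4 is used instead of vec3 to sidestep std430 padding surprises.
    struct Derivative {
        vec4 dx; // xyz = emulated dFdx(p), w unused
        vec4 dy; // xyz = emulated dFdy(p), w unused
    };

    layout(std430, set = 0, binding = 0) writeonly buffer DerivativeBuffer {
        Derivative derivatives[];
    };

    // One slot per invocation in the work group; dimensions match local_size.
    shared vec3 sharedP[8][8];

    // Placeholder for the real per-invocation value (hypothetical).
    vec3 computeP(uvec2 id) {
        return vec3(vec2(id), 0.0);
    }

    void main() {
        uvec2 local = gl_LocalInvocationID.xy;

        // Each invocation writes its own value to its own slot.
        sharedP[local.x][local.y] = computeP(gl_GlobalInvocationID.xy);

        // Make the shared writes visible, then wait for the whole work group.
        memoryBarrierShared();
        barrier();

        // Lowest invocation index in this 2x2 quad.
        uvec2 quadBase = (local / 2u) * 2u;

        // Higher index minus lower index, so all four invocations agree.
        vec3 p00  = sharedP[quadBase.x][quadBase.y];
        vec3 dpdx = sharedP[quadBase.x + 1u][quadBase.y] - p00;
        vec3 dpdy = sharedP[quadBase.x][quadBase.y + 1u] - p00;

        // Row-major index over the whole dispatch.
        uint width = gl_NumWorkGroups.x * gl_WorkGroupSize.x;
        uint index = gl_GlobalInvocationID.y * width + gl_GlobalInvocationID.x;
        derivatives[index].dx = vec4(dpdx, 0.0);
        derivatives[index].dy = vec4(dpdy, 0.0);
    }
    ```

    Because every invocation in a quad ends up with the same result, this behaves like the coarse variants (dFdxCoarse / dFdyCoarse) rather than the fine ones. Note that barrier() has to be reached in uniform control flow, so the shared write and the barrier should not sit inside a branch that only some invocations take.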
    

    The NVIDIA extension basically does this for you, though it's probably more efficient since it wouldn't need the shared variable.
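
    If your device and driver expose that extension, using it could look roughly like the sketch below. The extension name and the derivative_group_quadsNV qualifier come from GL_NV_compute_shader_derivatives; the 8x8 work group, the placeholder value for p, and the Results buffer are assumptions made up for the example. On the Vulkan side you would also need to enable VK_NV_compute_shader_derivatives and the computeDerivativeGroupQuads feature when creating the device.

    ```glsl
    #version 450
    #extension GL_NV_compute_shader_derivatives : require

    // With the quads mode, local_size_x and local_size_y must be multiples of 2.
    layout(local_size_x = 8, local_size_y = 8) in;

    // Group invocations into 2x2 quads for derivative purposes.
    // derivative_group_linearNV is the alternative, which groups
    // invocations by linear index in runs of four.
    layout(derivative_group_quadsNV) in;

    // Hypothetical output buffer so the derivatives are actually used.
    layout(std430, set = 0, binding = 0) writeonly buffer Results {
        vec4 results[];
    };

    void main() {
        // Placeholder value to differentiate (hypothetical).
        vec3 p = vec3(vec2(gl_GlobalInvocationID.xy), 0.0);

        // Allowed in a compute shader once the execution mode is declared.
        vec3 dpdx = dFdx(p);
        vec3 dpdy = dFdy(p);

        uint width = gl_NumWorkGroups.x * gl_WorkGroupSize.x;
        uint index = gl_GlobalInvocationID.y * width + gl_GlobalInvocationID.x;
        results[index] = vec4(dpdx.x, dpdy.y, 0.0, 0.0);
    }
    ```

    Declaring one of the two derivative group layouts is what adds the DerivativeGroupQuadsNV or DerivativeGroupLinearNV execution mode that the validation message asks for.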