
float4 vs 4 floats in DirectX


Which is faster in DirectX when sending data to the vertex shader?

struct VertexInputType
{
    float4 data : DATA; // x,y - POSITION, z - distance, w - size
};

vs

struct VertexInputType
{
    float2 pos : POSITION;
    float distance : DISTANCE;
    float size : SIZE;
};

A wild guess would be that the first one is faster because it packs into a single 128-bit register, but I suspect there is a better answer.


Solution

  • If you are thinking about memory transfer between CPU and GPU: as long as all the attributes come from the same vertex buffer, it shouldn't matter. The second struct is just a different interpretation of the same data on the shader side; it does not change the bytes that are actually transferred. If the second version is fed from multiple vertex streams, performance might differ, but that difference comes from how the buffers are set up, not from how the shader declares its input.

    If you are worried about vertex cache efficiency: in both cases 16 bytes are stored and fetched per vertex, so there is no difference there either.
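
The "same 16 bytes either way" point can be checked on the CPU side. The following is a minimal C++ sketch, not actual Direct3D code: `PackedVertex`, `SplitVertex` and `layouts_match` are hypothetical names made up for illustration, and the structs only mirror what a single interleaved vertex buffer would contain for each shader declaration.

```cpp
#include <cassert>   // used by the runtime check in the usage note
#include <cstddef>   // offsetof
#include <cstring>   // std::memcmp

// Hypothetical CPU-side mirror of the first declaration (one float4).
struct PackedVertex {
    float data[4];   // x, y = position, z = distance, w = size
};

// Hypothetical CPU-side mirror of the second declaration (float2 + float + float).
struct SplitVertex {
    float pos[2];
    float distance;
    float size;
};

// Both layouts occupy 16 bytes per vertex, with matching field offsets.
static_assert(sizeof(PackedVertex) == 16, "packed layout should be 16 bytes");
static_assert(sizeof(SplitVertex) == 16, "split layout should be 16 bytes");
static_assert(offsetof(SplitVertex, distance) == 8, "distance follows the two position floats");
static_assert(offsetof(SplitVertex, size) == 12, "size is the last float");

// Returns true when the two layouts hold byte-identical vertex data.
bool layouts_match(float x, float y, float distance, float size) {
    PackedVertex p{{x, y, distance, size}};
    SplitVertex s{{x, y}, distance, size};
    return std::memcmp(&p, &s, sizeof p) == 0;
}
```

Calling `assert(layouts_match(1.0f, 2.0f, 3.0f, 4.0f));` should pass, since the buffer contents are the same and only the names the shader gives to the four floats differ. This assumes everything lives in one interleaved buffer; splitting the attributes across several vertex streams is a separate setup that this sketch does not cover.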