Interpretation of the VK_FORMAT_R16G16_S10_5_NV format

The Vulkan spec says:

VK_FORMAT_R16G16_S10_5_NV specifies a two-component, fixed-point format where most significant bit specifies the sign bit, next 10 bits specify the integer value and last 5 bits represent the fractional value.

I'm not sure whether this means the value is stored in 2's complement or as sign and magnitude. For example, is -1.5 represented as:

0b1'0000000001'10000 (i.e. sign = 1, 10-bit integer part = 1, 5-bit fractional part = 16)

or as:

0b1'1111111110'10000 (i.e. the top 11 bits are the 2's complement of 2, 5-bit fractional part = 16, so -2 + 16/32 = -1.5; in other words, the 16-bit 2's complement of 48)

Solution

Simply put, it's 2's complement.
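As a minimal sketch of what that means in practice: if the format is 2's complement, each 16-bit component can be read as a signed integer and divided by 2^5 = 32. The helper name decodeS10_5 below is hypothetical; it is not part of Vulkan or the NVIDIA SDK.

```cpp
#include <cstdint>

// Hypothetical helper (not a Vulkan or NVIDIA SDK function).
// Assumption: the component is a 16-bit 2's complement integer scaled by 2^5.
inline float decodeS10_5(uint16_t raw)
{
    // Reinterpret the bit pattern as signed. This conversion is modular
    // (2's complement) by definition since C++20 and behaves that way on
    // mainstream compilers before that.
    const int16_t fixed = static_cast<int16_t>(raw);

    // 5 fractional bits: one unit of the raw value is 1/32.
    return static_cast<float>(fixed) / 32.0f;
}
```

Under this interpretation, the second candidate from the question, 0b1'1111111110'10000 (0xFFD0), decodes to -1.5, while the sign-magnitude candidate 0b1'0000000001'10000 (0x8030) would decode to -1022.5.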

NVIDIA's SDK samples contain a few hints. In NvOFUtilsVulkan.cpp there is a format conversion function that maps VK_FORMAT_R16G16_S10_5_NV to NV_OF_BUFFER_FORMAT_SHORT2. In NvOF.cpp, the NV_OF_BUFFER_FORMAT_SHORT2 format is in turn associated with the NV_OF_FLOW_VECTOR structure. The documentation for that structure describes the format as S10.5, and in a forum post someone from NVIDIA references this Introduction to Fixed Point, which describes a 2's complement format.

In the sample they also have a (broken) convertFloat2Fixed function in NvOFDataLoader.cpp.

void NvOFDataLoaderFlo::convertFloat2Fixed(const float* pFlowFloat, NV_OF_FLOW_VECTOR* pFlowFixed)
{
    for (uint32_t y = 0; y < m_height; ++y)
    {
        for (uint32_t x = 0; x < m_width; ++x)
        {
            pFlowFixed[y * m_width + x].flowx = static_cast<uint16_t>(pFlowFloat[(y * 2 * m_width) + 2 * x] * m_precision);
            pFlowFixed[y * m_width + x].flowy = static_cast<uint16_t>(pFlowFloat[(y * 2 * m_width) + 2 * x + 1] * m_precision);
        }
    }
}

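Given that this code is clearly intended to run on x86 systems, where such a cast in practice goes through a signed integer conversion and keeps the low 16 bits, it strongly points to the format being 2's complement. That said, cast to int16_t first: converting a float to an unsigned integer type is undefined behavior when the truncated value is negative, and it actively breaks when done with negative constants.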
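A sketch of the conversion with the cast order fixed might look like the following. It is not NVIDIA's code: the name encodeS10_5 is hypothetical, the scale factor of 32 assumes that the sample's m_precision is 2^5, and the clamping is an addition of mine to keep the float-to-integer conversion in range.

```cpp
#include <cmath>
#include <cstdint>

// Hypothetical helper (not part of the NVIDIA sample).
// Assumption: the scale factor is 32, i.e. 5 fractional bits.
inline uint16_t encodeS10_5(float value)
{
    // Scale to fixed-point units and clamp to the int16_t range so that the
    // float -> int16_t conversion below is always well defined.
    float scaled = value * 32.0f;
    scaled = std::fmax(scaled, -32768.0f);
    scaled = std::fmin(scaled, 32767.0f);

    // Convert to a signed type first (truncates toward zero, like the sample).
    const int16_t fixed = static_cast<int16_t>(scaled);

    // Signed -> unsigned conversion is well defined (modulo 2^16) and yields
    // the 2's complement bit pattern.
    return static_cast<uint16_t>(fixed);
}
```

With this, encodeS10_5(-1.5f) produces 0xFFD0, which matches the 2's complement candidate from the question.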

It should also be noted that, despite there being 5 fractional bits, only 2 are used. The documentation states:

And Last 5 bits represents fractional value of the flow vector. Note that, though we are using 5 bits for fractional value, we only support quarter pixel precision. It means, of the 5 available bits, only 2 bits are used.
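To illustrate what quarter-pixel precision would imply for the raw values, here is a small sketch. It assumes that the 2 bits in use are the two most significant fractional bits, which the quoted text does not state explicitly; under that assumption every value is a multiple of 8 raw units (8/32 = 0.25) and the low 3 bits are zero. The helper name isQuarterPixelAligned is hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper (not part of the NVIDIA SDK).
// Assumption: quarter-pixel precision means only the top 2 of the
// 5 fractional bits carry information, so the low 3 bits are zero.
inline bool isQuarterPixelAligned(uint16_t raw)
{
    // This works for negative values too: in 2's complement, multiples of 8
    // have their low 3 bits clear regardless of sign.
    return (raw & 0x7u) == 0;
}
```

For example, 0xFFD0 (-1.5) and 0xFFF8 (-0.25) pass this check, while 0x0001 (1/32) does not.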