c++ winapi bit bit-shift ms-media-foundation

Add bit padding (bit shifting?) to 10-bit values stored in a byte array

I'm looking for an efficient way to bit shift left (<<) 10 bit values that are stored within a byte array using C++/Win32.

I am receiving an uncompressed 4:2:2 10-bit video stream via UDP. Because of how the bits are packed, the data is stored in an unsigned char array.

The data is always sent so that groups of pixels finish on a byte boundary. In this case, 4 pixels sampled at a bit depth of 10 use exactly 5 bytes (40 bits), packed most significant bit first with no padding between the values.

The renderer I am using (Media Foundation Enhanced Video Renderer) requires each 10-bit value to be placed in a 16-bit WORD with 6 padding bits on the right. This is annoying, but I assume it is there to keep every value byte-aligned in memory.

What is an efficient way of left shifting each 10-bit value by 6 bits (and moving it to a new array if needed)? Although I will be receiving varying lengths of data, the data will always consist of these 40-bit blocks.

I'm sure a crude loop with some bit masking would suffice, but that sounds expensive to me: I have to process 1500 packets per second, each with ~1200 bytes of payload.

Edit for clarity

Example Input:

unsigned char byteArray[5] = {0b01110101, 0b01111010, 0b00001010, 0b11111010, 0b00000110};

Desired Output:

WORD wordArray[4] = {0b0111010101000000, 0b1110100000000000, 0b1010111110000000, 0b1000000110000000};

(or the same resulting data in a byte array)

Solution

This does the job:

void ProcessPGroup(const uint8_t byteArrayIn[5], uint16_t twoByteArrayOut[4])
{
    twoByteArrayOut[0] = (((uint16_t)byteArrayIn[0] & 0b11111111u) << (0 + 8)) | (((uint16_t)byteArrayIn[1] & 0b11000000u) << 0);
    twoByteArrayOut[1] = (((uint16_t)byteArrayIn[1] & 0b00111111u) << (2 + 8)) | (((uint16_t)byteArrayIn[2] & 0b11110000u) << 2);
    twoByteArrayOut[2] = (((uint16_t)byteArrayIn[2] & 0b00001111u) << (4 + 8)) | (((uint16_t)byteArrayIn[3] & 0b11111100u) << 4);
    twoByteArrayOut[3] = (((uint16_t)byteArrayIn[3] & 0b00000011u) << (6 + 8)) | (((uint16_t)byteArrayIn[4] & 0b11111111u) << 6);
}

Don't be confused by the [5] and [4] values in the function signature above. The compiler treats both parameters as plain pointers, so the sizes do nothing except tell you, the user, the mandatory, expected number of elements in each array. See my answer here on this: Passing an array as an argument to a function in C. Passing an array that is shorter results in undefined behavior and is a bug!
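As a minimal sketch of that point: the static_assert below checks that the array-style parameters really are just pointers in the function's type. ProcessPGroupChecked is a hypothetical variant (the name is mine, it is not part of the code in this answer) that takes references to arrays instead, so the element counts become part of the type and a wrongly sized array fails to compile.

```cpp
#include <cstdint>
#include <type_traits>

// Same body as ProcessPGroup in this answer; the [5] and [4] are documentation only.
void ProcessPGroup(const uint8_t byteArrayIn[5], uint16_t twoByteArrayOut[4])
{
    twoByteArrayOut[0] = (((uint16_t)byteArrayIn[0] & 0b11111111u) << (0 + 8)) | (((uint16_t)byteArrayIn[1] & 0b11000000u) << 0);
    twoByteArrayOut[1] = (((uint16_t)byteArrayIn[1] & 0b00111111u) << (2 + 8)) | (((uint16_t)byteArrayIn[2] & 0b11110000u) << 2);
    twoByteArrayOut[2] = (((uint16_t)byteArrayIn[2] & 0b00001111u) << (4 + 8)) | (((uint16_t)byteArrayIn[3] & 0b11111100u) << 4);
    twoByteArrayOut[3] = (((uint16_t)byteArrayIn[3] & 0b00000011u) << (6 + 8)) | (((uint16_t)byteArrayIn[4] & 0b11111111u) << 6);
}

// The array-style parameters are adjusted to plain pointers in the function type.
static_assert(std::is_same<decltype(&ProcessPGroup),
                           void (*)(const uint8_t*, uint16_t*)>::value,
              "array parameters are adjusted to pointers");

// Hypothetical variant: references to arrays keep the element count in the
// type, so passing e.g. a uint8_t[4] as input is a compile-time error.
void ProcessPGroupChecked(const uint8_t (&byteArrayIn)[5], uint16_t (&twoByteArrayOut)[4])
{
    ProcessPGroup(byteArrayIn, twoByteArrayOut);
}
```

The trade-off is that the reference version only accepts real 5- and 4-element arrays. When walking through a larger receive buffer with pointer arithmetic, the pointer-style signature is the more convenient one.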

Full test code (download it in my eRCaGuy_hello_world repo here: cpp/process_10_bit_video_data.cpp):

test.cpp

/*

GS
17 Mar. 2021

To compile and run:
    mkdir -p bin && g++ -Wall -Wextra -Werror -ggdb -std=c++17 -o bin/test \
    test.cpp && bin/test

*/

#include <bitset>
#include <cstdint>
#include <cstdio>
#include <cstring>
#include <iostream>

// Get the number of elements in any C array
// - Usage example: [my own answer]:
//   https://arduino.stackexchange.com/questions/80236/initializing-array-of-structs/80289#80289
#define ARRAY_LEN(array) (sizeof(array)/sizeof(array[0]))

/// \brief      Process a packed video P group, which is 4 pixels of 10 bits each (exactly 5 uint8_t
///             bytes) into a uint16_t 4-element array (1 element per pixel).
/// \details    Each group of 10-bits for a pixel will be placed into a 16-bit word, with all 10
///             bits left-shifted to the far left edge, leaving 6 empty (zero) bits in the right
///             side of the word.
/// \param[in]  byteArrayIn  5 bytes of 10-bit pixel data for exactly 4 pixels; any array size < 5
///                        will result in undefined behavior! So, ensure you pass the proper array
///                        size in!
/// \param[out] twoByteArrayOut  The output array into which the 4 pixels will be packed, 10 bits per
///                        16-bit word, all 10 bits shifted to the left edge; any array size < 4
///                        will result in undefined behavior!
/// \return     None
void ProcessPGroup(const uint8_t byteArrayIn[5], uint16_t twoByteArrayOut[4])
{
    twoByteArrayOut[0] = (((uint16_t)byteArrayIn[0] & 0b11111111u) << (0 + 8)) | (((uint16_t)byteArrayIn[1] & 0b11000000u) << 0);
    twoByteArrayOut[1] = (((uint16_t)byteArrayIn[1] & 0b00111111u) << (2 + 8)) | (((uint16_t)byteArrayIn[2] & 0b11110000u) << 2);
    twoByteArrayOut[2] = (((uint16_t)byteArrayIn[2] & 0b00001111u) << (4 + 8)) | (((uint16_t)byteArrayIn[3] & 0b11111100u) << 4);
    twoByteArrayOut[3] = (((uint16_t)byteArrayIn[3] & 0b00000011u) << (6 + 8)) | (((uint16_t)byteArrayIn[4] & 0b11111111u) << 6);
}

// Reference: https://stackoverflow.com/questions/7349689/how-to-print-using-cout-a-number-in-binary-form/7349767
void PrintArrayAsBinary(const uint16_t* twoByteArray, size_t len)
{
    std::cout << "{\n";
    for (size_t i = 0; i < len; i++)
    {
        std::cout << std::bitset<16>(twoByteArray[i]);
        if (i < len - 1)
        {
            std::cout << ",";
        }
        std::cout << std::endl;
    }
    std::cout << "}\n";
}

int main()
{
    printf("Processing 10-bit video data example\n");

    constexpr uint8_t TEST_BYTE_ARRAY_INPUT[5] = {0b01110101, 0b01111010, 0b00001010, 0b11111010, 0b00000110};
    constexpr uint16_t TEST_TWO_BYTE_ARRAY_OUTPUT[4] = {
        0b0111010101000000, 0b1110100000000000, 0b1010111110000000, 0b1000000110000000};

    uint16_t twoByteArrayOut[4];
    ProcessPGroup(TEST_BYTE_ARRAY_INPUT, twoByteArrayOut);

    if (std::memcmp(twoByteArrayOut, TEST_TWO_BYTE_ARRAY_OUTPUT, sizeof(TEST_TWO_BYTE_ARRAY_OUTPUT)) == 0)
    {
        printf("TEST PASSED!\n");
    }
    else
    {
        printf("TEST ==FAILED!==\n");

        std::cout << "expected = \n";
        PrintArrayAsBinary(TEST_TWO_BYTE_ARRAY_OUTPUT, ARRAY_LEN(TEST_TWO_BYTE_ARRAY_OUTPUT));

        std::cout << "actual = \n";
        PrintArrayAsBinary(twoByteArrayOut, ARRAY_LEN(twoByteArrayOut));
    }

    return 0;
}

Sample run and output:

$ mkdir -p bin && g++ -Wall -Wextra -Werror -ggdb -std=c++17 \
-o bin/test test.cpp && bin/test
Processing 10-bit video data example
TEST PASSED!
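The question mentions payloads of varying length that always consist of whole 40-bit blocks. A possible way to apply ProcessPGroup to a complete payload is sketched below. ProcessPayload, its parameters, and its "return 0 on bad input" convention are my own assumptions for illustration, not something the renderer or the original code prescribes.

```cpp
#include <cstddef>
#include <cstdint>

// Same body as ProcessPGroup in this answer: 5 packed bytes in, 4 left-aligned words out.
void ProcessPGroup(const uint8_t byteArrayIn[5], uint16_t twoByteArrayOut[4])
{
    twoByteArrayOut[0] = (((uint16_t)byteArrayIn[0] & 0b11111111u) << (0 + 8)) | (((uint16_t)byteArrayIn[1] & 0b11000000u) << 0);
    twoByteArrayOut[1] = (((uint16_t)byteArrayIn[1] & 0b00111111u) << (2 + 8)) | (((uint16_t)byteArrayIn[2] & 0b11110000u) << 2);
    twoByteArrayOut[2] = (((uint16_t)byteArrayIn[2] & 0b00001111u) << (4 + 8)) | (((uint16_t)byteArrayIn[3] & 0b11111100u) << 4);
    twoByteArrayOut[3] = (((uint16_t)byteArrayIn[3] & 0b00000011u) << (6 + 8)) | (((uint16_t)byteArrayIn[4] & 0b11111111u) << 6);
}

// Hypothetical helper: converts a whole payload made of packed P groups.
// Returns the number of 16-bit values written, or 0 if the payload length is
// not a multiple of 5 bytes or the output buffer is too small.
std::size_t ProcessPayload(const uint8_t* payload, std::size_t payloadLen,
                           uint16_t* valuesOut, std::size_t valuesOutCapacity)
{
    constexpr std::size_t BYTES_PER_GROUP = 5;
    constexpr std::size_t VALUES_PER_GROUP = 4;

    if (payloadLen % BYTES_PER_GROUP != 0)
    {
        return 0;
    }

    const std::size_t numGroups = payloadLen / BYTES_PER_GROUP;
    if (numGroups * VALUES_PER_GROUP > valuesOutCapacity)
    {
        return 0;
    }

    // Each iteration consumes 5 input bytes and produces 4 output words.
    for (std::size_t g = 0; g < numGroups; g++)
    {
        ProcessPGroup(payload + g * BYTES_PER_GROUP, valuesOut + g * VALUES_PER_GROUP);
    }

    return numGroups * VALUES_PER_GROUP;
}
```

For scale, a 1200-byte payload holds 240 groups, so 1500 packets per second works out to 360,000 calls to ProcessPGroup per second. Whether that is fast enough on a given machine is something to measure rather than assume. Note also that the words are written in the host's byte order.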


References:

How to print (using cout) a number in binary form?
[my answer] Passing an array as an argument to a function in C
[my eRCaGuy_hello_world repo] ARRAY_LEN() macro: see utilities.h
https://en.cppreference.com/w/cpp/string/byte/memcmp
