Efficiently unpack mono12packed bitstring format with Python


I have raw data from a camera in the mono12packed format. This is an interleaved bit format that stores two 12-bit integers in 3 bytes to eliminate overhead. Explicitly, the memory layout for each group of 3 bytes looks like this:

Byte 1 = Pixel0 Bits 11-4
Byte 2 = Pixel1 Bits 3-0 (high nibble) + Pixel0 Bits 3-0 (low nibble)
Byte 3 = Pixel1 Bits 11-4
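
To make the layout concrete, here is a minimal Python 3 sketch that packs two 12-bit values into 3 bytes following the layout above and unpacks them again. The function names `pack_pair` and `unpack_pair` are made up for illustration and are not part of any camera SDK.

```python
def pack_pair(p0, p1):
    """Pack two 12-bit pixel values into 3 bytes (mono12packed layout)."""
    byte1 = (p0 >> 4) & 0xFF                  # Pixel0 bits 11-4
    byte2 = ((p1 & 0x0F) << 4) | (p0 & 0x0F)  # Pixel1 bits 3-0, then Pixel0 bits 3-0
    byte3 = (p1 >> 4) & 0xFF                  # Pixel1 bits 11-4
    return bytes([byte1, byte2, byte3])


def unpack_pair(triple):
    """Inverse of pack_pair: 3 bytes -> (p0, p1)."""
    byte1, byte2, byte3 = triple
    p0 = (byte1 << 4) | (byte2 & 0x0F)  # high 8 bits + low nibble of byte 2
    p1 = (byte3 << 4) | (byte2 >> 4)    # high 8 bits + high nibble of byte 2
    return p0, p1


packed = pack_pair(117, 120)
print(packed)               # b'\x07\x85\x07'
print(unpack_pair(packed))  # (117, 120)
```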

I have a file from which all the bytes can be read in binary mode; let's assume it is called binfile.

To get the pixel data from the file, I do:

from bitstring import BitArray as Bit

# Read the whole file into memory as raw bytes
with open(binfile, 'rb') as f:
    bytestring = f.read()

a = []
for i in range(len(bytestring) // 3):  # reading 2 pixels = 3 bytes at a time
    s = Bit(bytes=bytestring[i*3:i*3+3], length=24)
    p0 = s[0:8] + s[12:16]  # byte 1 + low nibble of byte 2
    p1 = s[16:] + s[8:12]   # byte 3 + high nibble of byte 2
    a.append(p0.uint)       # .uint gives a plain int; unpack() would return a list
    a.append(p1.uint)

This works, but it is horribly slow, and I would like to do it more efficiently because I have to process a huge amount of data.

My idea is that by reading more than 3 bytes at a time I could save some time in the conversion step, but I can't figure out how to do that.
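
One possible reading of that idea, as a hedged Python 3 sketch: instead of slicing out 3 bytes per loop iteration, take three strided slices over the whole buffer (every first, second and third byte of each group) and combine them pairwise. The function name `unpack_mono12packed` is hypothetical, and no timing is claimed here; whether it is faster in practice would need to be measured on real data.

```python
def unpack_mono12packed(data):
    """Unpack a mono12packed bytes object into a flat list of 12-bit ints."""
    usable = len(data) - len(data) % 3  # ignore an incomplete trailing group
    first = data[0:usable:3]   # Pixel0 bits 11-4 of every group
    middle = data[1:usable:3]  # shared byte holding both low nibbles
    last = data[2:usable:3]    # Pixel1 bits 11-4 of every group

    pixels = [0] * (2 * len(middle))
    # Even positions hold Pixel0 of each group, odd positions hold Pixel1
    pixels[0::2] = [(hi << 4) | (mid & 0x0F) for hi, mid in zip(first, middle)]
    pixels[1::2] = [(hi << 4) | (mid >> 4) for hi, mid in zip(last, middle)]
    return pixels


print(unpack_mono12packed(b'\x07\x85\x07\x05\x9d\x06'))  # [117, 120, 93, 105]
```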

Another idea: since the bits come in packs of 4, maybe there is a way to work on nibbles rather than on individual bits.
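
As an illustration of the nibble view only, here is a small Python 3 sketch that turns the data into a hex string (one character per nibble) and reorders the nibbles of each 3-byte group. The function name `unpack_via_nibbles` is invented for this example. String handling like this is not necessarily faster than integer bit operations; it mainly shows that the format is a pure nibble shuffle.

```python
import binascii


def unpack_via_nibbles(data):
    """Unpack mono12packed bytes by rearranging hex digits (nibbles)."""
    hexstr = binascii.hexlify(data).decode('ascii')  # one hex digit per nibble
    pixels = []
    # 6 nibbles = 3 bytes = 2 pixels; an incomplete trailing group is skipped
    for i in range(0, len(hexstr) - 5, 6):
        n = hexstr[i:i + 6]
        pixels.append(int(n[0:2] + n[3], 16))  # Pixel0: byte 1 + low nibble of byte 2
        pixels.append(int(n[4:6] + n[2], 16))  # Pixel1: byte 3 + high nibble of byte 2
    return pixels


print(unpack_via_nibbles(b'\x07\x85\x07\x05\x9d\x06'))  # [117, 120, 93, 105]
```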

Data example:

The bytes

'\x07\x85\x07\x05\x9d\x06'

lead to the data

[117, 120, 93, 105]

Solution

  • Have you tried bitwise operators? Maybe that's a faster way:

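    with open('binfile.txt', 'rb') as binfile:
      bytestring = list(bytearray(binfile.read()))  # list of ints in 0..255

    a = []

    # Step through the data 3 bytes (= 2 pixels) at a time
    for i in range(0, len(bytestring), 3):
      px_bytes = bytestring[i:i+3]
      p0 = (px_bytes[0] << 4) | (px_bytes[1] & 0x0F)         # byte 1 + low nibble of byte 2
      p1 = (px_bytes[2] << 4) | ((px_bytes[1] >> 4) & 0x0F)  # byte 3 + high nibble of byte 2
      a.append(p0)
      a.append(p1)

    print(a)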
    

    This also outputs: [117, 120, 93, 105]

    Hope it helps!
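
For very large files, the same bitwise operations can be applied while streaming the file in chunks instead of loading it all at once. The following is a minimal Python 3 sketch under stated assumptions: the name `iter_pixels` and the parameter `groups_per_chunk` are hypothetical, and the stream is assumed to be a buffered binary file (or `io.BytesIO`) whose `read(n)` returns exactly `n` bytes until the end of the data, so that chunks stay aligned to 3-byte groups.

```python
import io


def iter_pixels(stream, groups_per_chunk=4096):
    """Yield 12-bit pixel values from a binary stream of mono12packed data."""
    chunk_size = 3 * groups_per_chunk  # always a multiple of 3 bytes
    while True:
        chunk = stream.read(chunk_size)
        if len(chunk) < 3:  # end of data (or an incomplete trailing group)
            break
        for i in range(0, len(chunk) - 2, 3):
            b0, b1, b2 = chunk[i], chunk[i + 1], chunk[i + 2]
            yield (b0 << 4) | (b1 & 0x0F)  # Pixel0
            yield (b2 << 4) | (b1 >> 4)    # Pixel1


# Usage with the example bytes, using an in-memory stream in place of a file
example = io.BytesIO(b'\x07\x85\x07\x05\x9d\x06')
print(list(iter_pixels(example)))  # [117, 120, 93, 105]
```

With a real file, the stream would come from `open(path, 'rb')`, and the generator can be consumed lazily so that only one chunk is held in memory at a time.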