Search code examples
c#.net-3.5

Reading bit-aligned data


I have been reading the SWF format available on Adobe's site and it mentions that in order to save space, variable bits are used to store integers or floats (page 17 in the pdf)

I have always worked with byte-aligned data so have not given much thought to files that are bit-aligned, or have variable alignment where the information is stored in each byte.

So for example, you may have a struct containing four 13-bit integers stored sequentially ( rather than storing them as four 16-bit integers).

The first 13bits is the first integer, the next 13 bits is the second integer, and so on. It pads the last byte appropriate to make the struct byte-aligned with the rest of the file, so 52-bits would be padded to 56-bits, requiring 7 bytes to store those four integers as opposed to 8 bytes.

  • How do I approach this kind of problem?
  • How can I work with a stream of bytes at the bit-level?
  • Is there something I can use to help make it easier to work with this data?

I imagine the solution boils down to using bit-operations on byte arrays.

An example solution for parsing the four 13-bit integers would be nice as well to demonstrate the use of your suggested method.


Solution

  • There are two ways of dealing with this that I know of. The first is to manually do it - using bit-wise operators, division, modulus etc. on byte arrays [or integer/ulong etc if you were bored]. IsBitSet Example

    The other way is a BitArray - which handles most of this for you :)


    It would be nice to add an example of how exactly BitArray handles getting bits 13..25 as an int, as that would be the primary operation. At a first glance I see only a loop.

    Fine... I wrote a quick & dirty test proof of concept:

    var rnd = new Random();
    //var data = Enumerable.Range(0, 10).ToArray();
    var data = Enumerable.Range(0, 10).Select(x => rnd.Next(1 << 13)).ToArray();
    
    foreach (var n in data) Console.WriteLine(n);
    
    Console.WriteLine(new string('-', 13));
    
    var bits = new BitArray(data.Length * 13);
    
    for (int i = 0; i < data.Length; i++)
    {
        var intBits = new BitArray(new[] { data[i] });
        for (int b = 12; b > -1; b--)
        {
            bits[i * 13 + b] = intBits[b];
            Console.Write(intBits[b] ? 1 : 0);
        }
        Console.WriteLine();
    }
    Console.WriteLine(new string('-', 13));
    
    for (int i = 0; i < bits.Length / 13; i++)
    {
        int number = 0;
        for (int b = 12; b > -1; b--)
            if (bits[i * 13 + b])
                number += 1 << b;
    
        Console.WriteLine(number);
    }
    Console.ReadLine();
    

    Which outputs:

    910
    3934
    7326
    7990
    7712
    1178
    6380
    3460
    5113
    7489
    -------------
    0001110001110
    0111101011110
    1110010011110
    1111100110110
    1111000100000
    0010010011010
    1100011101100
    0110110000100
    1001111111001
    1110101000001
    -------------
    910
    3934
    7326
    7990
    7712
    1178
    6380
    3460
    5113
    7489
    

    The bit array doesn't do much other than simplify accessing - it's still quite manual. I expect you'd write your own classes to simply this and make it neat and reusable - for example here's another quick concept:

    //Improved to take sign into account.
    //Sign is in addition to bits allocated for storage in this version.
    //Stored as {sign}{bits}
    //E.g.  -5, stored in 3 bits signed is:
    //       1 101
    //E.g.   5, stored in 3 bits [with sign turned on]
    //       0 101
    //E.g.   5, stored in 3 bits no sign
    //         101  
    //This may differ from your exiting format - e.g. you may use two's compliments.
    static void Main(string[] args)
    {
        int bitsPerInt = 13;
    
        //Create your data
        var rnd = new Random();
        //var data = Enumerable.Range(-5, 10).ToArray();
        var data = Enumerable.Range(0, 10).Select(x => rnd.Next(-(1 << bitsPerInt), 1 << bitsPerInt)).ToArray();
    
        var bits = new BitSerlializer();
    
        //Add length header
        bits.AddInt(data.Length, 8, false);
        foreach (var n in data)
        {
            bits.AddInt(n, bitsPerInt);
            Console.WriteLine(n);
        }
    
        //Serialize to bytes for network transfer etc.
        var bytes = bits.ToBytes();
    
        Console.WriteLine(new string('-', 10));
        foreach (var b in bytes) Console.WriteLine(Convert.ToString(b, 2).PadLeft(8, '0'));
        Console.WriteLine(new string('-', 10));
    
        //Deserialize
        bits = new BitSerlializer(bytes);
        //Get Length Header
        var count = bits.ReadInt(8, false);
        for (int i = 0; i < count; i++)
            Console.WriteLine(bits.ReadInt(bitsPerInt));
    
        Console.ReadLine();
    }
    
    public class BitSerlializer
    {
        List<byte> bytes;
        int Position { get; set; }
    
        public BitSerlializer(byte[] initialData = null)
        {
            if (initialData == null)
                bytes = new List<byte>();
            else
                bytes = new List<byte>(initialData);
        }
    
        public byte[] ToBytes() { return bytes.ToArray(); }
    
        public void Addbit(bool val)
        {
            if (Position % 8 == 0) bytes.Add(0);
            if (val) bytes[Position / 8] += (byte)(128 >> (Position % 8));
            Position++;
        }
    
        public void AddInt(int i, int length, bool isSigned = true)
        {
            if (isSigned) Addbit(i < 0);
            if (i < 0) i = -i;
    
            for (int pos = --length; pos >= 0; pos--)
            {
                var val = (i & (1 << pos)) != 0;
                Addbit(val);
            }
        }
    
        public bool ReadBit()
        {
            var val = (bytes[Position / 8] & (128 >> (Position % 8))) != 0;
            ++Position;
            return val;
        }
    
        public int ReadInt(int length, bool isSigned = true)
        {
            var val = 0;
            var sign = isSigned && ReadBit() ? -1 : 1;
    
            for (int pos = --length; pos >= 0; pos--)
                if (ReadBit())
                    val += 1 << pos;
    
            return val * sign;
        }
    }