How to compress a 4-byte Length to fit into only 1 byte of a Tag-Length-Value stream?


I found an implementation that handles Tag-Length-Value differently: it allows lengths greater than 255. The TLV examples I have seen work like this:

I need to store my ID and name:

Tag 1, Length 5, Value (ID) = 12345, and so on
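
For concreteness, here is a minimal sketch of that simple layout, assuming the ID is stored as ASCII text and the length always fits in one byte. SimpleTlv and Encode are hypothetical names used only for illustration.

```csharp
using System;
using System.Text;

public static class SimpleTlv
{
    // Hypothetical helper: encodes one TLV item with a single length byte,
    // so the value can be at most 255 bytes long.
    public static byte[] Encode(byte tag, byte[] value)
    {
        if (value.Length > 255)
            throw new ArgumentException("Value does not fit a one-byte length.", nameof(value));

        var result = new byte[2 + value.Length];
        result[0] = tag;                                // Tag
        result[1] = (byte)value.Length;                 // Length (one byte)
        Array.Copy(value, 0, result, 2, value.Length);  // Value
        return result;
    }
}
```

Calling SimpleTlv.Encode(1, Encoding.ASCII.GetBytes("12345")) produces the bytes 01 05 31 32 33 34 35: tag 1, length 5, then the five ASCII digits.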

Back to what I need: how does this guy/TlvEncoding compress the length so that it can take more than 1 byte (up to 4 bytes)? He does something with the AND and OR bitwise operators and values such as 127. For example:

Assume that I have a string of length 1000, which exceeds what 1 byte can hold. He does the following.

Note: 0xffff = 65,535, the maximum value of a UInt16 (ushort). I think a length of 1000 satisfies the length <= 0xffff condition, so the stream writes 2 bytes for the length. But when he reads that value back he does more bit manipulation, which is where I get lost. :(

    /// <summary>
    /// Write TLV length to stream
    /// </summary>
    /// <param name="stream">stream to write to</param>
    /// <param name="length">length to write or null to write indefinite length</param>
    public static void WriteLength(Stream stream, int? length)
    {
        if (length == null)
        {
            stream.WriteByte(0x80); // indefinite form
            return;
        }

        if (length < 0 || length > 0xffffffff)
            throw new TlvException(string.Format("Invalid length value: {0}", length));

        if (length <= 0x7f) // use short form if possible
        {
            stream.WriteByte(checked((byte)length));
            return;
        }

        byte lengthBytes;

        // use minimum number of octets
        if (length <= 0xff)
            lengthBytes = 1;
        else if (length <= 0xffff)
            lengthBytes = 2;
        else if (length <= 0xffffff)
            lengthBytes = 3;
        else if (length <= 0xffffffff)
            lengthBytes = 4;
        else
            throw new TlvException(string.Format("Length value too big: {0}", length));

        stream.WriteByte((byte)(lengthBytes | 0x80));

        // shift out the bytes
        for (var i = lengthBytes - 1; i >= 0; i--)
        {
            var data = (byte)(length >> (8 * i));
            stream.WriteByte(data);
        }
    }

In the read operation he does the following. (I think the relevant line is the one with 0x7f; // remove 0x80 bit.) I don't know exactly why he chose 0x7f, which equals 127, or why he removes 0x80, which equals 128.

    /// <summary>
    /// Read TLV length from stream
    /// </summary>
    /// <param name="stream">Stream to read</param>
    /// <returns>length or null to indicate indefinite length</returns>
    public static int? ReadLength(Stream stream)
    {
        var readByte = stream.ReadByte();
        if (readByte == -1)
            throw new TlvException("Unexpected end of stream while reading length");

        if ((readByte & 0x80) == 0)
            return (int)readByte; // length is in first byte

        int length = 0;
        var lengthBytes = readByte & 0x7f; // remove 0x80 bit

        if (lengthBytes == 0)
            return null; // indefinite form

        if (lengthBytes > 4)
            throw new TlvException($"Unsupported length: {lengthBytes} bytes");

        for (var i = 0; i < lengthBytes; i++)
        {
            readByte = stream.ReadByte();
            if (readByte == -1)
                throw new TlvException("Unexpected end of stream while reading length");

            length <<= 8;
            length |= (int)readByte;
        }

        return length;
    }

Please, I just need to understand what is going on there, because I need to apply the same method to an array of bytes. I need to store, for example, a string of length 1500, a datetime, and some integers as TLVs. But all I know is how to use a 1-byte length (up to 255).

So I can only read a 1-byte length, because I don't know how to tell the reader whether the length occupies 3 bytes or 2 bytes. Otherwise I would have to store every TLV with a 2- or 3-byte length, which is a waste of space.

  1. An integer is stored as 4 bytes (OK), so I can write the length as (byte) 4 (note the byte cast).
  2. A string can only be stored with a length of up to 255. How can I do it like the code above, e.g. a length of 1500 in 3 bytes?

Put simply: he sometimes stores the length in 1 byte and sometimes in 3 bytes. I don't understand how he tells the compiler/stream to read the length from the next 3 bytes, the next 2 bytes, or 1 byte.


Solution

  • From WriteLength, it looks fairly simple.

    Values between 0 and 0x7F inclusive are just represented by a single byte with that value. So if you want to write e.g. the value 5, you write a byte with the value 5.

    For values > 0x7F:

    1. The first byte is the number of bytes needed to represent the value, with the highest-bit set (so that you can tell the difference between it and a simple byte which holds a value between 0-127)
    2. The next however-many bytes hold the actual value.

    | Value | Serialized          |
    |-------|---------------------|
    | 0     | 0x00                |
    | 1     | 0x01                |
    | 127   | 0x7F                |
    | 128   | 0x81 0x80           |
    | 129   | 0x81 0x81           |
    | 255   | 0x81 0xFF           |
    | 256   | 0x82 0x01 0x00      |
    | 257   | 0x82 0x01 0x01      |
    | 65535 | 0x82 0xFF 0xFF      |
    | 65536 | 0x83 0x01 0x00 0x00 |

    And so on...
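
    Since the question is about a plain byte array rather than a Stream, the same scheme might be sketched as follows. This is only an illustration, not the library's code: TlvLength, Write, and Read are hypothetical names, the indefinite form (a lone 0x80) is rejected rather than returned as null, and Read assumes the array holds enough bytes. The mask 0x80 tests the flag bit, and 0x7F clears it so that only the byte count remains.

```csharp
using System;
using System.Collections.Generic;

public static class TlvLength
{
    // Appends the encoded length to a byte list (hypothetical helper).
    public static void Write(List<byte> buffer, int length)
    {
        if (length < 0)
            throw new ArgumentOutOfRangeException(nameof(length));

        if (length <= 0x7F)
        {
            buffer.Add((byte)length); // short form: high bit clear, value 0..127
            return;
        }

        // Long form: work out how many bytes the value needs (1 to 4).
        int count = length <= 0xFF ? 1 : length <= 0xFFFF ? 2 : length <= 0xFFFFFF ? 3 : 4;

        buffer.Add((byte)(0x80 | count)); // high bit set, low 7 bits = byte count
        for (int i = count - 1; i >= 0; i--)
            buffer.Add((byte)(length >> (8 * i))); // most significant byte first
    }

    // Reads a length starting at 'offset' and advances 'offset' past it.
    public static int Read(byte[] data, ref int offset)
    {
        byte first = data[offset++];
        if ((first & 0x80) == 0)
            return first; // short form: the byte is the length itself

        int count = first & 0x7F; // clear the flag bit, leaving the byte count
        if (count == 0 || count > 4)
            throw new FormatException("Unsupported length encoding.");

        int length = 0;
        for (int i = 0; i < count; i++)
            length = (length << 8) | data[offset++];

        if (length < 0)
            throw new FormatException("Length does not fit in a signed 32-bit integer.");
        return length;
    }
}
```

    With this sketch, a length of 1500 is written as 0x82 0x05 0xDC (three bytes in total), while a length of 5 is still the single byte 0x05. The reader never has to be told in advance how many bytes to expect, because the first byte says so.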