Tags: c#, deserialization, protobuf-net, networkstream

C# Protobuf-net: how do I deserialize consecutively from a network stream?


I am (gratefully) using Marc Gravell's excellent Protobuf-net Protocol Buffers library. Unfortunately, I am being unintelligent and having trouble understanding the correct way to deserialize objects that come in over the wire.

As a basis for my effort, I am following the advice in these questions:

Issue deserializing (protocolBuffer) serialized data using protobuf-net
protobuf: consecutive serialize and deserialize to/from socket

I am holding a TCP connection open for an entire session (minutes... hours... who knows?), and using a single network stream that persists as long as the connection does. I am receiving many PB messages over this stream.

Given that I am using "TryDeserializeWithLengthPrefix", I had expected the library to sort out waiting on the stream to have enough data to parse a whole object. But this does not appear to be the case.

I wrote a simple test:

    [Test]
    public void DeserializeSimpleObjectInParts ()
    {
        SimpleObject origin = new SimpleObject();
        byte[] wholeObj = ProtobufSerializer.SerializeToByteArray ( origin );

        // Split the serialized bytes into two halves to simulate a message
        // that arrives over the wire in two separate reads.
        int length = wholeObj.Length;
        int midpoint = length / 2;
        byte[] aaa = new byte[midpoint];
        byte[] bbb = new byte[length - midpoint];
        Array.Copy ( wholeObj, 0, aaa, 0, aaa.Length );
        Array.Copy ( wholeObj, midpoint, bbb, 0, bbb.Length );

        using ( MemoryStream streamA = new MemoryStream ( aaa ) )
        using ( MemoryStream streamB = new MemoryStream ( bbb ) )
        {
            Debug.LogDebug ( "streamA.Length = " + streamA.Length + ", streamB.Length = " + streamB.Length );

            // Only the first half is handed to the deserializer; streamB is never read.
            object resultA = null;
            bool succeededA = false;
            try
            {
                succeededA = ProtobufSerializer.Deserialize ( streamA, out resultA );
            }
            catch ( Exception e )
            {
                Debug.LogDebug ( "Exception = " + e );
                Debug.LogDebug ( "streamA.Position = " + streamA.Position + ", streamA.Length = " + streamA.Length );
            }
        }
    }

Please note that, in the above, "ProtobufSerializer.Deserialize" is simply a call through my own service class straight into "TryDeserializeWithLengthPrefix". The type-encoding logic lives in that wrapper as well.

When I run this test, the following exception is thrown:

System.IO.EndOfStreamException: Failed to read past end of stream.

This advice from our friends at Google suggests it is my responsibility to take a look at the length prefix in my socket code. But how does that help? C# network streams don't have a Length property for me to compare against.

Do I really have to do an intermediate step of running a loop over the stream, building byte arrays of the correct size, then making new streams from them which I pass into Protobuf-net? That seems like an anti-pattern and a performance hit.

What am I missing here?


Solution

> Given that I am using "TryDeserializeWithLengthPrefix", I had expected the library to sort out waiting on the stream to have enough data to parse a whole object.

There's no reliable and efficient way to "sort out waiting" on an abstract Stream object.

> C# network streams don't have a Length property for me to compare against.

Count the bytes you copy from the socket stream into the buffer stream. That running total is your "Length" property to compare against the length prefix you read at the beginning of each message. Once you have enough data, seek the buffer stream back to the prefix and pass it to the deserializer to get your object. Then seek the buffer stream back to the start again and reuse it for the next message. Repeat.
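
The counting approach described above could be sketched roughly as follows. This is a minimal illustration, not protobuf-net code: the FrameReader class and its members are hypothetical names, and it assumes every message is framed with a 4-byte little-endian length prefix. If your messages use a varint (base-128) prefix instead, the prefix parsing would need to change accordingly.

```csharp
using System;
using System.IO;

// Hypothetical helper (not part of protobuf-net). Assumes each message is
// framed as: 4-byte little-endian payload length, then the payload bytes.
public sealed class FrameReader
{
    private readonly Stream _source;                            // e.g. a NetworkStream
    private readonly MemoryStream _buffer = new MemoryStream(); // reused for every message
    private readonly byte[] _chunk = new byte[4096];

    public FrameReader(Stream source)
    {
        _source = source;
    }

    // Copies exactly `count` bytes from the source into the buffer stream,
    // counting as it goes. Returns false if the source ends first.
    private bool Fill(int count)
    {
        int copied = 0;
        while (copied < count)
        {
            int n = _source.Read(_chunk, 0, Math.Min(_chunk.Length, count - copied));
            if (n == 0) return false; // Read returns 0 only at end of stream
            _buffer.Write(_chunk, 0, n);
            copied += n;
        }
        return true;
    }

    // Returns the buffer stream holding one complete message (prefix included),
    // rewound to the prefix, or null if the source ended before a whole
    // message arrived.
    public MemoryStream ReadFrame()
    {
        _buffer.SetLength(0); // discard the previous message; position becomes 0
        if (!Fill(4)) return null;

        byte[] raw = _buffer.GetBuffer();
        int length = raw[0] | (raw[1] << 8) | (raw[2] << 16) | (raw[3] << 24);
        if (length < 0) throw new InvalidDataException("Negative length prefix.");

        if (!Fill(length)) return null;

        _buffer.Position = 0; // seek back to the prefix for the deserializer
        return _buffer;
    }
}
```

The stream returned by ReadFrame can then be handed to a length-prefix-aware deserializer, for example the question's own ProtobufSerializer.Deserialize wrapper, provided the prefix style it expects matches the framing assumed here. Because the same MemoryStream is reused, each message must be deserialized before the next call to ReadFrame.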