
protobuf-net and sql server ce


We have been successfully using protobuf-net v1 in a Compact Framework application to serialize our objects for storage in a SQL Server CE database.

Recently we hit a roadblock, apparently due to using too many types (if we serialize fewer types the error goes away). Ref: http://code.google.com/p/protobuf-net/issues/detail?id=50#c6

In desperation (we're supposed to release soon) we downloaded v2 and have been using it (without pre-compiling the serializers). However, we occasionally get strange errors when deserializing data -- "unknown wire-type 6" and an error reading an Int32 (somehow it gets an overflow when casting to an int, which doesn't make sense given that the data was previously serialized by the same method). It looks to me like the binary data is being corrupted somewhere -- but we simply store it in a varbinary field in SQL Server CE and pull it back out.

Does anyone have any ideas how the binary data could be corrupted? (See code below)

FINAL FIX:

Please read Marc's answer for some background. As best we can tell, the problem was with how the SetBytes method behaves -- it does not appear to clear out or truncate the existing column data, so if the binary data being saved is smaller than what was previously stored, junk is left at the end.

We fixed it by changing this:

if (buffer.Length > 0)
{
    record.SetBytes(insertSet.GetOrdinal(SerializedDataColumnName), 0, buffer, 0, buffer.Length);
}

to this:

if (buffer.Length > 0)
{
    record.SetValue(insertSet.GetOrdinal(SerializedDataColumnName), null);
    record.SetBytes(insertSet.GetOrdinal(SerializedDataColumnName), 0, buffer, 0, buffer.Length);
}
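
For what it's worth, an alternative sketch (not the fix we shipped, and assuming SetValue accepts a byte[] for the varbinary column) is to hand the column a correctly sized array via SetValue, which replaces the stored value outright instead of patching bytes over whatever is already there:

var ordinal = insertSet.GetOrdinal(SerializedDataColumnName);
if (ms.Length > 0)
{
    // ToArray() returns exactly ms.Length bytes; SetValue replaces the whole value,
    // so no stale bytes from a previous, longer payload can survive.
    record.SetValue(ordinal, ms.ToArray());
}
else
{
    record.SetValue(ordinal, null);
}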

Thank you.

UPDATE: Code used to serialize to the DB (code suggestions welcome as well as problem areas):

command.CommandType = CommandType.TableDirect;
MemoryStream ms = null;
using (SqlCeResultSet insertSet = command.ExecuteResultSet(ResultSetOptions.Updatable))
{
    foreach (var item in items)
    {
        ms = new MemoryStream();
        Serializer.Serialize<T>(ms, item);
        var record = insertSet.CreateRecord();
        var buffer = ms.GetBuffer();
        if (buffer.Length > 0)
        {
            record.SetBytes(insertSet.GetOrdinal(SerializedDataColumnName), 0, buffer, 0, buffer.Length);
        }
        else
        {
            record.SetValue(insertSet.GetOrdinal(SerializedDataColumnName), null);
        }
        insertSet.Update();
    }
}
if (ms != null)
{
    ms.Dispose();
}
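
For reference, a tidier shape for the same loop (a sketch only, untested against our schema): scoping the MemoryStream to each iteration disposes every stream rather than just the last one, it writes only the ms.Length bytes the serializer actually produced (see Marc's answer below), and it uses Insert(record) -- the usual call for a record obtained from CreateRecord() -- where the snippet above calls Update():

command.CommandType = CommandType.TableDirect;
using (SqlCeResultSet insertSet = command.ExecuteResultSet(ResultSetOptions.Updatable))
{
    int ordinal = insertSet.GetOrdinal(SerializedDataColumnName);
    foreach (var item in items)
    {
        using (var ms = new MemoryStream())
        {
            Serializer.Serialize<T>(ms, item);
            var record = insertSet.CreateRecord();
            if (ms.Length > 0)
            {
                // write only the bytes the serializer actually produced
                record.SetBytes(ordinal, 0, ms.GetBuffer(), 0, (int)ms.Length);
            }
            else
            {
                record.SetValue(ordinal, null);
            }
            insertSet.Insert(record);
        }
    }
}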

Code used to deserialize:

using (var ms = new MemoryStream())
{
    using (SqlCeResultSet recordSet = command.ExecuteResultSet(ResultSetOptions.Scrollable))
    {
        //var serializer = null; //ServiceDepository.TryGetProvider<TypeModel, T>();
        while (recordSet.Read())
        {
            if (!recordSet.IsDBNull(recordSet.GetOrdinal(SerializedDataColumnName)))
            {
                // passing a null buffer makes GetBytes return the total length of the field
                var count = recordSet.GetBytes(recordSet.GetOrdinal(SerializedDataColumnName), 0, null, 0, 1);
                var bytes = new byte[count];
                recordSet.GetBytes(recordSet.GetOrdinal(SerializedDataColumnName), 0, bytes, 0, (int)count);
                if (bytes.Length > 0)
                {
                    var ms2 = new MemoryStream(bytes);
                    item = Serializer.Deserialize<T>(ms2);
                }
            }
            if (item == null)
            {
                //handle 'empty' items -- there were no properties
                //  that needed to be serialized
                item = new T();
            }
            list.Add(item);
        }
    }
}

Solution

  • I can see one problem: you are asking the MemoryStream for GetBuffer() and then using that buffer's length. GetBuffer() returns the oversized backing buffer. Either call .ToArray() on the MemoryStream to get a correctly sized array, or -- if you don't want to allocate an extra array -- keep GetBuffer() but only store the first memStream.Length bytes of it; the rest should be considered garbage (it is most likely all zeros, but a zero byte is not a valid protobuf field header). See the sketch below.

    Now, this could just be part of the issue, but we should eliminate it first...
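
    A minimal sketch of that fix, reusing the names from the serialize loop above (ms, record and insertSet are assumed to still be in scope):

    // ms.GetBuffer() hands back the oversized backing array; only the first
    // ms.Length bytes are serialized data, the rest is padding (usually zeros).
    byte[] exact = ms.ToArray();   // right-sized copy, safe to store as-is

    // ...or avoid the extra allocation and bound the write explicitly:
    record.SetBytes(insertSet.GetOrdinal(SerializedDataColumnName),
        0, ms.GetBuffer(), 0, (int)ms.Length);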