I have a list of about 500 million items. I can serialize it to a file with protobuf-net if I serialize the items individually rather than as a list: I cannot collect the items into a List<Price> and then serialize it, because I run out of memory. So I have to serialize one record at a time:
    using (var input = File.OpenText("..."))
    using (var output = new FileStream("...", FileMode.Create, FileAccess.Write))
    {
        string line;
        while ((line = input.ReadLine()) != null)
        {
            Price price = new Price();
            // (code that parses input into a Price record)
            Serializer.Serialize(output, price);
        }
    }
My question is about the deserialization part. It appears that the Deserialize method does not move the Position of the stream to the next record. I tried:
    using (var input = new FileStream("...", FileMode.Open, FileAccess.Read))
    {
        Price price = null;
        while ((price = Serializer.Deserialize<Price>(input)) != null)
        {
        }
    }
I see one real-looking Price record, and then the rest are empty records -- I get the Price object back but all fields are initialized to default values.
How do I properly deserialize a stream that contains a sequence of objects that were not serialized as a list?
Good news! The protobuf-net API is set up for exactly this scenario. You should see a SerializeItems and DeserializeItems pair of methods that work with IEnumerable<T>, allowing streaming both in and out. The easiest way to feed it an enumerable is via an "iterator block" over the source data.
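For illustration, an iterator block over the source file might look roughly like the sketch below. The Price shape (Symbol, Value), the "SYMBOL,VALUE" line format, and the PriceSource.ReadPrices name are all assumptions made up for this example; the question does not show the real record or parsing code.

```csharp
using System.Collections.Generic;
using System.Globalization;
using System.IO;

// Hypothetical record shape; the real Price type is not shown in the question.
public class Price
{
    public string Symbol { get; set; }
    public double Value { get; set; }
}

public static class PriceSource
{
    // Iterator block: yields one Price at a time, so the full
    // 500-million-item list never has to exist in memory.
    public static IEnumerable<Price> ReadPrices(string path)
    {
        using (var input = File.OpenText(path))
        {
            string line;
            while ((line = input.ReadLine()) != null)
            {
                // Assumed line format: "SYMBOL,VALUE"
                var parts = line.Split(',');
                yield return new Price
                {
                    Symbol = parts[0],
                    Value = double.Parse(parts[1], CultureInfo.InvariantCulture)
                };
            }
        }
    }
}
```

Because the sequence is lazy, nothing is read until the consumer starts enumerating, and the file is closed when enumeration finishes or the enumerator is disposed.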
If, for whatever reason, that isn't convenient, the result is 100% identical to using SerializeWithLengthPrefix and DeserializeWithLengthPrefix on a per-item basis, specifying (as parameters) field: 1 and prefix-style: base-128. You could even use SerializeWithLengthPrefix for the writing, and DeserializeItems for the reading (as long as you use field 1 and base-128).
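As a rough sketch of that last combination (SerializeWithLengthPrefix to write, DeserializeItems to read), something like the following should work. The Price members and the PriceStreaming helper are hypothetical names for this example only; the parts that matter are PrefixStyle.Base128 and field number 1 on both sides.

```csharp
using System.Collections.Generic;
using System.IO;
using ProtoBuf;

// Hypothetical contract; substitute the real Price definition.
[ProtoContract]
public class Price
{
    [ProtoMember(1)] public string Symbol { get; set; }
    [ProtoMember(2)] public double Value { get; set; }
}

public static class PriceStreaming
{
    // Writes each item with its own length prefix so record boundaries survive.
    public static void Write(string path, IEnumerable<Price> prices)
    {
        using (var output = new FileStream(path, FileMode.Create, FileAccess.Write))
        {
            foreach (var price in prices)
            {
                Serializer.SerializeWithLengthPrefix(output, price, PrefixStyle.Base128, 1);
            }
        }
    }

    // Lazily reads the items back one at a time; the stream stays open
    // only while the caller is enumerating.
    public static IEnumerable<Price> Read(string path)
    {
        using (var input = new FileStream(path, FileMode.Open, FileAccess.Read))
        {
            foreach (var price in Serializer.DeserializeItems<Price>(input, PrefixStyle.Base128, 1))
            {
                yield return price;
            }
        }
    }
}
```

Usage would be along the lines of `foreach (var p in PriceStreaming.Read("...")) { ... }`, which processes one record at a time without materializing a list.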
Re the example: I'd have to see that in a fully reproducible scenario to comment. Actually, what I would expect there is that you only get a single object back out, containing the combined values from each object, because without the length prefix the protobuf spec assumes you are just concatenating values into a single object. The two approaches mentioned above avoid this issue.
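To see the concatenation behaviour in isolation, a small in-memory experiment along these lines may help. The Price contract here is again a hypothetical stand-in. Under the protobuf encoding rules, a later value for a scalar field replaces an earlier one, so I would expect a single object carrying the values of the second record, with the stream fully consumed by that one call; treat that as an expectation to check rather than a guarantee.

```csharp
using System;
using System.IO;
using ProtoBuf;

// Hypothetical contract used only for this experiment.
[ProtoContract]
public class Price
{
    [ProtoMember(1)] public string Symbol { get; set; }
    [ProtoMember(2)] public double Value { get; set; }
}

public static class ConcatenationDemo
{
    public static void Main()
    {
        using (var ms = new MemoryStream())
        {
            // Two records written back to back with no length prefix.
            Serializer.Serialize(ms, new Price { Symbol = "A", Value = 1.0 });
            Serializer.Serialize(ms, new Price { Symbol = "B", Value = 2.0 });

            ms.Position = 0;

            // A single Deserialize call reads to the end of the stream
            // and treats everything it finds as fields of one object.
            var merged = Serializer.Deserialize<Price>(ms);

            Console.WriteLine(merged.Symbol + " " + merged.Value);
            Console.WriteLine("Stream fully consumed: " + (ms.Position == ms.Length));
        }
    }
}
```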