I am trying to read a file with the .pb extension. Specifically, I would like to read this dataset (in .tgz).
I wrote the following code:
Path path = Paths.get(filename);
byte[] data = Files.readAllBytes(path);
Document document = Document.parseFrom(data);
But then I received the following error.
com.google.protobuf.InvalidProtocolBufferException: Protocol message tag had invalid wire type.
The last line of the code caused this error, but I do not know how to solve it.
Your files are actually in "delimited" format: each one contains multiple messages, each with a length prefix.
InputStream stream = new FileInputStream(filename);
Document document = Document.parseDelimitedFrom(stream);
Keep calling parseDelimitedFrom(stream) to read more messages until it returns null (end of file).
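To make the "delimited" format concrete, here is a minimal sketch that uses only the JDK, assuming the usual protobuf framing in which each message is preceded by its byte length encoded as a base-128 varint. The names DelimitedSplitter, readLengthPrefix and splitDelimited are made up for illustration and are not part of the protobuf library. In practice parseDelimitedFrom does this work for you, so this is only meant to show why parseFrom on the whole file fails.

```java
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of protobuf): splits a length-delimited
// stream into the raw bytes of each message.
class DelimitedSplitter {

    // Reads one varint length prefix: 7 payload bits per byte, low bits first,
    // with the high bit set on every byte except the last.
    // Returns -1 if the stream ends cleanly before a new prefix starts.
    static int readLengthPrefix(InputStream in) throws IOException {
        int result = 0;
        for (int shift = 0; shift < 32; shift += 7) {
            int b = in.read();
            if (b == -1) {
                if (shift == 0) {
                    return -1; // clean end of file between messages
                }
                throw new EOFException("stream ended inside a length prefix");
            }
            result |= (b & 0x7F) << shift;
            if ((b & 0x80) == 0) {
                return result;
            }
        }
        throw new IOException("length prefix is longer than 5 bytes");
    }

    // Reads (length prefix, message body) pairs until end of stream.
    static List<byte[]> splitDelimited(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        List<byte[]> messages = new ArrayList<>();
        int length;
        while ((length = readLengthPrefix(data)) != -1) {
            if (length < 0) {
                throw new IOException("negative message length");
            }
            byte[] body = new byte[length];
            data.readFully(body); // throws EOFException if the body is truncated
            messages.add(body);
        }
        return messages;
    }
}
```

Each byte[] returned by splitDelimited could then be handed to the generated class's parseFrom(byte[]). The first bytes of the file are a length, not a field tag, which is why parsing the whole file as a single message produces the "invalid wire type" error.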
Also note that the file I looked at (testNegative.pb in heldout_relations.tgz) appeared to contain instances of Relation, not Document. Make sure you are parsing the correct type, because the protobuf implementation can't tell the difference -- you'll get garbage if you parse the wrong type.