java, protocol-buffers

java: Protocol message tag had invalid wire type error when reading .pb file


I am trying to read a file with the .pb extension. Specifically, I would like to read this dataset (distributed as a .tgz archive).

I wrote the following code:

Path path = Paths.get(filename);
byte[] data = Files.readAllBytes(path);
Document document = Document.parseFrom(data);

But then I received the following error:

com.google.protobuf.InvalidProtocolBufferException: Protocol message tag had invalid wire type.

The last line of the code caused this error, but I do not know how to solve it.


Solution

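  • Your files are actually in "delimited" format: each one contains multiple messages, each with a length prefix. parseFrom(data) expects the whole byte array to be a single message, so it misreads the length prefix and the bytes after it as field tags, which is why it fails.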

    InputStream stream = new FileInputStream(filename);
    Document document = Document.parseDelimitedFrom(stream);
    

    Keep calling parseDelimitedFrom(stream) to read more messages until it returns null (end of file).

    Also note that the file I looked at -- testNegative.pb in heldout_relations.tgz -- appeared to contain instances of Relation, not Document. Make sure you are parsing the correct type: the protobuf wire format carries no type information, so the implementation can't tell the difference -- you'll get garbage (or an exception) if you parse the wrong type.
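
To make the "delimited" layout concrete, here is a minimal sketch that splits such a stream into raw message payloads using only java.io. It assumes each message is preceded by its size encoded as a base-128 varint, which is what writeDelimitedTo produces. The class name DelimitedFrames and its methods are made up for illustration and are not part of the protobuf library. In real code you would just call parseDelimitedFrom in a loop as described above.

```java
import java.io.ByteArrayInputStream;
import java.io.DataInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;

/** Illustration only (hypothetical helper, not protobuf API). */
final class DelimitedFrames {

    /** Reads one varint length prefix; returns -1 if the stream is already at its end. */
    static long readLengthPrefix(InputStream in) throws IOException {
        int b = in.read();
        if (b == -1) {
            return -1; // clean end of file: no more messages
        }
        long result = b & 0x7F;
        int shift = 7;
        // The high bit of each byte says whether another byte follows.
        while ((b & 0x80) != 0) {
            if (shift >= 35) {
                throw new IOException("Malformed length prefix");
            }
            b = in.read();
            if (b == -1) {
                throw new EOFException("Stream ended inside a length prefix");
            }
            result |= (long) (b & 0x7F) << shift;
            shift += 7;
        }
        return result;
    }

    /** Returns the raw bytes of every length-prefixed message in the stream. */
    static List<byte[]> readAll(InputStream in) throws IOException {
        DataInputStream data = new DataInputStream(in);
        List<byte[]> payloads = new ArrayList<>();
        long length;
        while ((length = readLengthPrefix(data)) != -1) {
            if (length > Integer.MAX_VALUE) {
                throw new IOException("Length prefix too large: " + length);
            }
            byte[] payload = new byte[(int) length];
            data.readFully(payload); // throws EOFException if the message is truncated
            payloads.add(payload);
        }
        return payloads;
    }

    public static void main(String[] args) throws IOException {
        // Two fake "messages": a 3-byte payload followed by a 2-byte payload.
        byte[] sample = {3, 'a', 'b', 'c', 2, 'h', 'i'};
        List<byte[]> frames = readAll(new ByteArrayInputStream(sample));
        System.out.println(frames.size() + " messages found");
    }
}
```

Each payload returned by readAll could then be handed to the generated class, for example Relation.parseFrom(payload), assuming Relation really is the type stored in the file.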