Tags: java, protocol-buffers, proto

Identify ProtoBuf class from byte array


I am writing a program that handles two proto messages. I need to process a byte[] that arrives from different sources, each of which sends either a Foo message or a Bar message. Since I cannot tell which message a given byte array belongs to, I tried using the Any class (which ships with protobuf) to parse the byte array and find out which class it belongs to, but I got a compile-time error. Is there another method I can use for detection, one that will still work if I add more proto message classes in the future?

//Foo.proto

syntax = "proto3";

option java_outer_classname = "FooProto";

message Foo {
    int32 a = 1;
}

And the second proto:

//Bar.proto
syntax = "proto3";

option java_outer_classname = "BarProto";

message Bar {
    int32 b = 1;
}

Code:

Any any = Any.parseFrom(protoBytes);
if (any.is(Foo.class)) {
  Foo foo = any.unpack(Foo.class);
  ...
} else {
  Bar bar = any.unpack(Bar.class);
  ...
}

Error in the if statement when invoking any.is():

The method is(Class<T>) in the type Any is not applicable for the arguments (Class<Foo>)


Solution

  • Any doesn't mean "any message"; it means "a message that was serialized via Any". If the sender didn't store it with Any, you can't decode it via Any.

    The key point here is that protobuf does not include type metadata in a message payload. If you have a BLOB, you usually can't know what the message type is. Any solves that by encoding the message type in a wrapper message, but you don't have that wrapper here.


    If your design is an API that accepts two different non-Any message types without prior knowledge of which one it is, you probably have a bad design, because that doesn't work with protobuf. On the wire, there is literally no difference between a Foo with a=42 and a Bar with b=42; the payloads are identical:

    Foo with a=42 is the bytes 08 2A; Bar with b=42 is the bytes 08 2A. The 08 means "field 1, encoded as a varint"; the 2A is a varint with raw value 42.
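
    To make the byte-level claim concrete, here is a minimal sketch that hand-encodes a single varint field using only the JDK. It does not use the protobuf library; the class name WireDemo and its helper methods are made up for this illustration, and the encoder only handles non-negative values. The field name (a or b) never enters the encoding, only the field number does, which is why both messages come out the same.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;

// Hypothetical helper for illustration; not part of the protobuf library.
final class WireDemo {
    // Encodes one int32 field as tag + varint value (wire type 0).
    // Sketch only: value must be non-negative.
    static byte[] encodeVarintField(int fieldNumber, int value) {
        if (value < 0) {
            throw new IllegalArgumentException("sketch handles non-negative values only");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        writeVarint(out, (fieldNumber << 3) | 0); // tag = field number plus wire type 0
        writeVarint(out, value);
        return out.toByteArray();
    }

    // Base-128 varint: 7 bits per byte, high bit set on every byte except the last.
    static void writeVarint(ByteArrayOutputStream out, int value) {
        while ((value & ~0x7F) != 0) {
            out.write((value & 0x7F) | 0x80);
            value >>>= 7;
        }
        out.write(value);
    }

    // Formats bytes as space-separated upper-case hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) sb.append(' ');
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] foo = encodeVarintField(1, 42); // Foo { a = 42 }: field number 1
        byte[] bar = encodeVarintField(1, 42); // Bar { b = 42 }: also field number 1
        System.out.println(hex(foo));                // 08 2A
        System.out.println(hex(bar));                // 08 2A
        System.out.println(Arrays.equals(foo, bar)); // true
    }
}
```

    Since the two arrays are equal byte for byte, no parser can tell from the payload alone whether it was produced from a Foo or a Bar.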

    A better design might be a wrapper message specific to your scenario:

    message RootMessage {
      oneof test_oneof {
         Foo foo = 1;
         Bar bar = 2;
      }
    }
    

    This adds a wrapper layer similar to how Any works, but much more efficiently: it only has to distinguish between your known types by an integer field number, rather than having to handle every possible type by its fully qualified type name.
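
    As a rough illustration of why the wrapper makes the two cases distinguishable, the following JDK-only sketch builds the wrapper bytes by hand and inspects the leading tag. The class OneofDemo and its methods are hypothetical names for this example, not generated protobuf code. It assumes field numbers 1 and 2 as in RootMessage, payloads shorter than 128 bytes, and a message containing only the single oneof field; a real parser handles the general case.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical helper for illustration; not generated protobuf code.
final class OneofDemo {
    static final int FOO_FIELD = 1; // Foo foo = 1;
    static final int BAR_FIELD = 2; // Bar bar = 2;

    // Wraps an already-serialized message as a length-delimited field (wire type 2).
    // Sketch only: field number must be at most 15 and payload at most 127 bytes,
    // so that the tag and the length each fit in one byte.
    static byte[] wrap(int fieldNumber, byte[] payload) {
        if (fieldNumber < 1 || fieldNumber > 15 || payload.length > 127) {
            throw new IllegalArgumentException("outside the range this sketch supports");
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        out.write((fieldNumber << 3) | 2);     // tag: field number plus wire type 2
        out.write(payload.length);             // single-byte length prefix
        out.write(payload, 0, payload.length); // nested message bytes
        return out.toByteArray();
    }

    // Reads the first tag and reports which member of the oneof is present.
    static String whichCase(byte[] rootBytes) {
        if (rootBytes.length == 0) return "NOT_SET";
        int tag = rootBytes[0] & 0xFF;
        int wireType = tag & 0x07;
        int fieldNumber = tag >>> 3;
        if (wireType == 2 && fieldNumber == FOO_FIELD) return "FOO";
        if (wireType == 2 && fieldNumber == BAR_FIELD) return "BAR";
        return "UNKNOWN";
    }

    public static void main(String[] args) {
        byte[] inner = {0x08, 0x2A}; // Foo { a = 42 } and Bar { b = 42 } alike
        System.out.println(whichCase(wrap(FOO_FIELD, inner))); // FOO
        System.out.println(whichCase(wrap(BAR_FIELD, inner))); // BAR
    }
}
```

    With the real library you would not decode tags yourself. Assuming RootMessage is compiled with protoc, the generated Java class should let you call RootMessage.parseFrom(bytes) and then switch on getTestOneofCase() to pick getFoo() or getBar(). Adding a new message type later means adding one more member to the oneof with a new field number.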