Tags: java, protocol-buffers, apache-flink, amazon-kinesis

Issue deserializing protobuf events in Apache Flink


I am reading events from Kinesis in my Flink app. The events are in protobuf format. If I use 'com.google.protobuf:protobuf-java:3.7.1' within the Flink app, I have no issues. However, if I change that to 'com.google.protobuf:protobuf-java:3.10.0', I get the following exception:

java.lang.IncompatibleClassChangeError: class com.google.protobuf.Descriptors$OneofDescriptor has interface com.google.protobuf.Descriptors$GenericDescriptor as super class
        at java.lang.ClassLoader.defineClass1(Native Method)
        at java.lang.ClassLoader.defineClass(ClassLoader.java:763)
        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
        at java.net.URLClassLoader.defineClass(URLClassLoader.java:468)
        at java.net.URLClassLoader.access$100(URLClassLoader.java:74)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:369)
        at java.net.URLClassLoader$1.run(URLClassLoader.java:363)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:362)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:424)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:357)
        at java.lang.Class.getDeclaredMethods0(Native Method)
        at java.lang.Class.privateGetDeclaredMethods(Class.java:2701)
        at java.lang.Class.privateGetPublicMethods(Class.java:2902)
        at java.lang.Class.privateGetPublicMethods(Class.java:2917)
        at java.lang.Class.getMethods(Class.java:1615)
        at org.apache.flink.api.java.typeutils.TypeExtractor.isValidPojoField(TypeExtractor.java:1786)
        at org.apache.flink.api.java.typeutils.TypeExtractor.analyzePojo(TypeExtractor.java:1856)
        at org.apache.flink.api.java.typeutils.TypeExtractor.privateGetForClass(TypeExtractor.java:1746)
        at org.apache.flink.api.java.typeutils.TypeExtractor.privateGetForClass(TypeExtractor.java:1643)
        at org.apache.flink.api.java.typeutils.TypeExtractor.createTypeInfoWithTypeHierarchy(TypeExtractor.java:921)
        at org.apache.flink.api.java.typeutils.TypeExtractor.privateCreateTypeInfo(TypeExtractor.java:781)
        at org.apache.flink.api.java.typeutils.TypeExtractor.createTypeInfo(TypeExtractor.java:735)
        at org.apache.flink.api.java.typeutils.TypeExtractor.createTypeInfo(TypeExtractor.java:731)
        at org.apache.flink.api.common.typeinfo.TypeInformation.of(TypeInformation.java:211)
        at org.apache.flink.api.java.typeutils.ListTypeInfo.<init>(ListTypeInfo.java:45)
        at com.bagi.streaming.serialization.ProtoSchema.getProducedType(ProtoSchema.java:40)
        at org.apache.flink.streaming.connectors.kinesis.serialization.KinesisDeserializationSchemaWrapper.getProducedType(KinesisDeserializationSchemaWrapper.java:57)
        at org.apache.flink.streaming.connectors.kinesis.FlinkKinesisConsumer.getProducedType(FlinkKinesisConsumer.java:363)
        at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.addSource(StreamExecutionEnvironment.java:1456)
        at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.addSource(StreamExecutionEnvironment.java:1414)
        at org.apache.flink.streaming.api.environment.StreamExecutionEnvironment.addSource(StreamExecutionEnvironment.java:1396)
        at com.bagi.streaming.StreamProcessor.getKinesisTrackingStream(StreamProcessor.java:101)
        at com.bagi.streaming.StreamProcessor.getKinesisTrackingStream(StreamProcessor.java:110)
        at com.bagi.streaming.StreamProcessor.consumeKinesis(StreamProcessor.java:117)
        at com.bagi.streaming.StreamProcessor.main(StreamProcessor.java:80)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:498)
        at org.apache.flink.client.program.PackagedProgram.callMainMethod(PackagedProgram.java:529)
        at org.apache.flink.client.program.PackagedProgram.invokeInteractiveModeForExecution(PackagedProgram.java:421)
        at org.apache.flink.client.program.ClusterClient.run(ClusterClient.java:423)
        at org.apache.flink.client.cli.CliFrontend.executeProgram(CliFrontend.java:813)
        at org.apache.flink.client.cli.CliFrontend.runProgram(CliFrontend.java:287)
        at org.apache.flink.client.cli.CliFrontend.run(CliFrontend.java:213)
        at org.apache.flink.client.cli.CliFrontend.parseParameters(CliFrontend.java:1050)
        at org.apache.flink.client.cli.CliFrontend.lambda$main$11(CliFrontend.java:1126)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1844)
        at org.apache.flink.runtime.security.HadoopSecurityContext.runSecured(HadoopSecurityContext.java:41)
        at org.apache.flink.client.cli.CliFrontend.main(CliFrontend.java:1126)

I am using Flink and 'com.twitter:chill-protobuf:0.9.3'. I am building the Flink app jar locally on my Mac. I have tried protoc at both 3.10.0 and 3.7.1 with protobuf-java at 3.10.0, in case that matters.

Here is my deserializer:

public class ProtoSchema implements DeserializationSchema<List<Event>> {

    @Override
    public List<Event> deserialize(byte[] message) throws IOException {

        List<Event> events = new LinkedList<>();
        InputStream inputStream = new ByteArrayInputStream(message);

        while (true) {
            Event event = Event.parseDelimitedFrom(inputStream);
            if (event != null) {
                events.add(event);
            } else {
                break;
            }
        }
        return events;
    }

    @Override
    public boolean isEndOfStream(List<Event> nextElement) {
        return false;
    }

    @Override
    public TypeInformation<List<Event>> getProducedType() {
        return new ListTypeInfo<>(Event.class);
    }
}

which I plug in like this:

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

Properties consumerConfig = new Properties();
consumerConfig.put(AWSConfigConstants.AWS_CREDENTIALS_PROVIDER, "AUTO");
consumerConfig.put(AWSConfigConstants.AWS_REGION, region);
consumerConfig.put(ConsumerConfigConstants.SHARD_GETRECORDS_INTERVAL_MILLIS, "300");
consumerConfig.put(ConsumerConfigConstants.SHARD_GETRECORDS_RETRIES, "10");
consumerConfig.put(ConsumerConfigConstants.SHARD_GETRECORDS_MAX, "5000");
consumerConfig.put(ConsumerConfigConstants.STREAM_INITIAL_POSITION, "LATEST");

env.addSource(new FlinkKinesisConsumer<>(name, new ProtoSchema(), consumerConfig)).name("KinesisSource");
env.getConfig().registerTypeWithKryoSerializer(Event.class, ProtobufSerializer.class);

Event.class is compiled from the protobuf schema using protoc and protobuf-java.


Solution

As you said in the comments, protobuf-java 3.9.0 introduced a change that is binary incompatible with lower versions (3.8 and below).

Descriptors.OneofDescriptor gained Descriptors.GenericDescriptor as a superclass. Binary-compatibility reports flag this kind of change because a static field from a super-interface of a client class may hide a field with the same name inherited from the new superclass, which causes an IncompatibleClassChangeError.

So if your classpath contains protobuf-java 3.9.0+ together with a lower version (3.8 or below), and code loads this class, you will get this error. The message in the stack trace fits this: OneofDescriptor is loaded from the newer jar, but GenericDescriptor resolves to the older jar, where it is still an interface. In my case the older version came from Hadoop, which ships protobuf-java 2.5, while my fat jar contained 3.10.

Solution:

1. Shade one of the incompatible protobuf-java dependencies (for example, with Gradle).
2. Or stay on version 3.8 or lower as a temporary, short-term workaround.
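
Before shading, it can help to confirm which jar actually supplies each of the two classes named in the error. The following is only a minimal sketch: ClasspathProbe and describe are hypothetical names, not part of Flink or protobuf, and the probe is only meaningful when run with the same classpath as the failing job (for example, including the Hadoop jars).

import java.security.CodeSource;

/** Hypothetical helper (not part of Flink or protobuf): reports where a class is loaded from. */
class ClasspathProbe {

    /** Returns a one-line description of the named class: its kind and the jar or directory it came from. */
    static String describe(String className) {
        try {
            // initialize=false: load the class without running its static initializers
            Class<?> c = Class.forName(className, false, ClasspathProbe.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            String where = (src == null || src.getLocation() == null)
                    ? "bootstrap or unknown location"
                    : src.getLocation().toString();
            String kind = c.isInterface() ? "interface" : "class";
            return className + " -> " + kind + " from " + where;
        } catch (ClassNotFoundException e) {
            return className + " -> not found on classpath";
        } catch (LinkageError e) {
            // IncompatibleClassChangeError is a subclass of LinkageError
            return className + " -> failed to link: " + e;
        }
    }

    public static void main(String[] args) {
        // Defaults to the two classes from the stack trace; nested classes use the '$' binary name.
        String[] names = args.length > 0 ? args : new String[] {
            "com.google.protobuf.Descriptors$GenericDescriptor",
            "com.google.protobuf.Descriptors$OneofDescriptor"
        };
        for (String name : names) {
            System.out.println(describe(name));
        }
    }
}

If GenericDescriptor is reported as an interface coming from a different jar than OneofDescriptor, or if OneofDescriptor fails to link, that points to the mixed-version classpath described above, and the reported jar is the one to shade or exclude.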