I'm seeing some logs within my flink app with respect to my thrift classes:
2020-06-01 14:31:28 INFO TypeExtractor:1885 - Class class com.test.TestStruct contains custom serialization methods we do not call, so it cannot be used as a POJO type and must be processed as GenericType. Please read the Flink documentation on "Data Types & Serialization" for details of the effect on performance.
So I followed the instructions here:
And I did that for the thrift of TestStruct
along with all the thrift structs within that. ( I've skipped over named types though ).
Also the thrift code that got generated is in Java whereas the flink app is written using scala.
How would I make that error disappear? Because I'm getting another bug where if I pass my dataStream to convert into that TestStruct
, some fields are missing. I suspect this is due to serialization issues?
Actually, as of now, you can't get rid of this warning, but it is also not a problem for the following reason:
The warning basically just says that Flink's type system is not using any of its internal serializers but will instead treat the type as a "generic type" which means, it is serialized via Kryo. If you followed my blog post on this, this is exactly what you want: use Kryo to serialize via Thrift. You could use a debugger to set a breakpoint into TBaseSerializer
to verify that Thrift is being used.
As for the missing fields, I would suspect that this happens during the conversion into your TestStruct
in your (flat)map operator and maybe not in the serialization that is used to pass this struct to the next operator. You should verify where these fields get missing - if you have this reproducible, a breakpoint in the debugger of your favourite IDE should help you find the cause.