apache-flink avro parquet

Why does Flink only have a keyValue sink writer for Avro?


I wonder why there is an AvroKeyValueSinkWriter for Flink, but no simple AvroSinkWriter that takes a regular (non-key-value) Schema.

I use it to generate Avro files in near real time, and once an hour I batch them into Parquet files. I use Flink's BucketingSink.

The key-value schema is giving me a hard time when generating Parquet. Did I miss something? Thanks!


Solution

  • You will not find much help with anything Flink.

    The documentation relies on Javadoc, and the examples are almost one-liners, like word count and other nonsense.

    I have yet to see what a "pro" Flink coder can do, so that I could learn the right way to do some of the simplest tasks. An example that reads from Kafka, parses an Avro or JSON record, and then writes specific data to a file system or HDFS would be great. You won't find any such examples.

    You would think that by now a search of the net would turn up some solid, complex examples.

    Most of these projects require you to read through all the source code and try to figure out an approach.

    In the end it is just easier to use Spring Boot and jam code into a service than to buy into Flink, and to some degree Spark.

    Best of luck to you.