google-cloud-platform apache-flink flink-streaming google-cloud-pubsub google-cloud-dataproc

What is the proper way to use Google Pub/Sub with Flink Streaming using Dataproc?


I'm trying to figure out the proper way to run Apache Flink on Dataproc with Google Pub/Sub as a source/sink. When I create a Dataproc cluster on the most recent image (1.4) and apply the Flink initialization action, Flink 1.6.4 gets installed.

The problem is that flink-connector-gcp-pubsub is only available starting from Flink version 1.9.0.
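For reference, this is roughly what pulling the connector into a Maven build looks like. It is only a sketch: the `_2.11` Scala suffix is an assumption and has to match the Scala version of the Flink distribution in use, and the version has to match the Flink version on the cluster, which is exactly why a 1.6.4 install does not work here.

```xml
<!-- Sketch only: the Scala suffix (_2.11) is assumed; match it to your Flink build -->
<dependency>
  <groupId>org.apache.flink</groupId>
  <artifactId>flink-connector-gcp-pubsub_2.11</artifactId>
  <version>1.9.0</version>
</dependency>
```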

So my question is: what is the proper way to use all of this together? Should I build my own GCE image with the latest Flink, or does one already exist?


Solution

  • I solved this problem by running Flink 1.9.0 on Kubernetes. This way I do not depend on the Flink version that anyone else's image ships and can run whatever version I need.