apache-flink apache-beam amazon-kinesis-analytics

Flink job reads Kafka records during startup and fails to start on AWS KDA

I am running a Flink-Beam job on KDA (Kafka --> Flink (Beam) --> Elasticsearch). The simple job won't start on KDA and goes into an infinite loop. AWS KDA Support replied that the job reads records during startup, which is the cause of the failure.

The dockerized version of the app runs smoothly with 3 taskmanagers in Kubernetes, but not on KDA, as KDA has a 2-minute timeout for starting a job.

As I understand it, Flink starts reading records only once the job has started. How do I reduce the startup time to less than 2 minutes? The job is very basic: it reads records from Kafka and stores them in ES.

Solution

I resolved the issue. Basically, Beam uses the direct runner by default.

It is important to set --runner=FlinkRunner so that your job starts as a Flink job.

Otherwise, the job stays in an infinite loop of reading from the Kafka topic.
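
If you cannot fully control the arguments the job receives, one option is to make sure the flag is present before the options are parsed. The following is only a minimal sketch using plain Java: the class name `RunnerArgs` and the helper `withFlinkRunner` are hypothetical names made up for this illustration, and it assumes the arguments use Beam's `--name=value` form. It appends `--runner=FlinkRunner` only when no `--runner=` argument was supplied, so an explicitly chosen runner is left alone.

```java
import java.util.Arrays;

// Hypothetical helper for illustration; not part of Beam or Flink.
final class RunnerArgs {
    private static final String RUNNER_PREFIX = "--runner=";
    private static final String FLINK_RUNNER_FLAG = "--runner=FlinkRunner";

    // Returns a copy of args that contains a --runner= argument.
    // If the caller already chose a runner, the arguments are returned unchanged.
    static String[] withFlinkRunner(String[] args) {
        for (String arg : args) {
            if (arg.startsWith(RUNNER_PREFIX)) {
                return args.clone();
            }
        }
        String[] result = Arrays.copyOf(args, args.length + 1);
        result[args.length] = FLINK_RUNNER_FLAG;
        return result;
    }

    public static void main(String[] args) {
        String[] effectiveArgs = withFlinkRunner(args);
        // Show what would be handed to Beam's option parsing.
        System.out.println(String.join(" ", effectiveArgs));
        // In the actual job, effectiveArgs would then be passed on, e.g. to
        // PipelineOptionsFactory.fromArgs(effectiveArgs), before creating the pipeline.
    }
}
```

The resulting array can then be used in place of the raw `args` when building the pipeline options, so the job is submitted as a Flink job instead of falling back to the direct runner.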
