Tags: apache-flink, flink-streaming

Flink StreamSink and Checkpoint Understanding


I have written a job where five different sources and sinks live in a single application. I am writing the data in Parquet format using a streaming file sink, and the Parquet sink writes data on checkpoint. If one of the sources receives some malformed records, I get an exception in its sink. But that causes all of my consumers to stop, and I am not able to write any data with the other sinks either.

Example:

source1(kafka) -> sink1(s3)
source2(kafka) -> sink2(s3)
source3(kafka) -> sink3(s3)

I need to understand why one sink failing causes all the consumers to stop, so that no data gets written to S3 at all. Can somebody please help me understand this, or am I missing something?


Solution

  • The application needs to fail, or otherwise the ordering and consistency guarantees cannot hold anymore. This is completely independent of checkpointing.

    If just one task fails, all other tasks in the same application need to fail as well, because Flink cannot know which tasks are relevant to the failure and which are not.

    In your case, you actually seem to have 3 independent applications, so you have three options:

    If they should fail together, you put them all in the same StreamExecutionEnvironment, as you have done.

    If all applications should run independently, you need to start the job 3 times with different parameters. The three deployments can then be restarted independently.
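    A minimal sketch of that second option, assuming a hypothetical `buildPipeline` helper and pipeline names: the same jar is deployed three times, and a command-line argument decides which source/sink pair each deployment builds, so each one fails and restarts on its own.

    ```java
    // Sketch: one jar, three independent deployments. The pipeline to run is
    // chosen via a command-line argument. The names and the buildPipeline
    // helper are illustrative placeholders, not part of Flink's API.
    public class PipelineSelector {

        // Returns a description of the pipeline that would be built for this name.
        static String buildPipeline(String name) {
            switch (name) {
                case "pipeline1": return "source1(kafka) -> sink1(s3)";
                case "pipeline2": return "source2(kafka) -> sink2(s3)";
                case "pipeline3": return "source3(kafka) -> sink3(s3)";
                default: throw new IllegalArgumentException("unknown pipeline: " + name);
            }
        }

        public static void main(String[] args) {
            String name = args.length > 0 ? args[0] : "pipeline1";
            System.out.println("deploying " + buildPipeline(name));
            // In a real Flink job, the chosen branch would add its Kafka source
            // and S3 sink to the StreamExecutionEnvironment and then call
            // env.execute(name).
        }
    }
    ```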

    If you would still like to deploy only once, you could spawn 3 StreamExecutionEnvironments and let them run in parallel in different threads. The main thread should then join on these threads.
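    A sketch of that third option in plain Java, with a hypothetical `runJob` standing in for creating a StreamExecutionEnvironment and calling `env.execute(...)` (which blocks until that job terminates): each job gets its own thread, and main joins on all of them so the process stays alive until every job has finished.

    ```java
    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.atomic.AtomicInteger;

    // Sketch: one deployment running three jobs in parallel threads.
    public class ParallelJobs {

        // Placeholder for building one StreamExecutionEnvironment, wiring one
        // source/sink pair, and calling env.execute(name), which blocks until
        // the job terminates.
        static void runJob(String name) {
            System.out.println(name + " running");
        }

        // Starts one thread per job name and joins on all of them; returns how
        // many jobs ran to completion.
        static int runAll(String... names) throws InterruptedException {
            AtomicInteger completed = new AtomicInteger();
            List<Thread> threads = new ArrayList<>();
            for (String name : names) {
                Thread t = new Thread(() -> {
                    runJob(name);
                    completed.incrementAndGet();
                });
                t.start();
                threads.add(t);
            }
            // main must join on the threads; otherwise it would return while
            // the jobs are still running.
            for (Thread t : threads) {
                t.join();
            }
            return completed.get();
        }

        public static void main(String[] args) throws InterruptedException {
            runAll("job1", "job2", "job3");
        }
    }
    ```

    Because each environment runs in its own thread, an exception in one job no longer takes the other two down, at the cost of the jobs sharing a single process.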