I have a flink job with a global window and custom process.
The Process is failed after ~10 minutes on the next error:
java.io.InterruptedIOException
This is my job:
SingleOutputStreamOperator<CustomEntry> result = stream
.keyBy(r -> r.getId())
.window(GlobalWindows.create())
.trigger(new CustomTriggeringFunction())
.process(new CustomProcessingFunction());
The CustomProcessingFunction is run for a long time (more then 10 minutes), but after 10 minutes, the process is stoped and failed on InterruptedIOException
Is it possible t increase the timeout of flink job?
From Flink's point of view, that's an unreasonably long period of time for a user function to run. What is this window process function doing that takes more than 10 minutes? Perhaps you can restructure this to use the async i/o operator instead, so you aren't completely blocking the pipeline.
That said, 10 minutes is the default checkpoint timeout interval, and you're preventing checkpoints from being able to complete while this function is running. So you could experiment with increasing execution.checkpointing.timeout
.
If the job is failing because checkpoints are timing out, that will help. Or you could increase execution.checkpointing.tolerable-failed-checkpoints
from its default (0).