java apache-flink flink-streaming

Flink TaskManager failing with OOM


I have a Flink job that listens to a Kafka topic; all it does is read messages from Kafka and insert them into Elasticsearch. The job keeps failing because one of the TaskManagers goes OOM.

I am running the Flink Docker image on Kubernetes with 3 GB of memory, but I see failures in the TaskManager logs. I am not using any special features; the job just listens to Kafka and writes to Elasticsearch.

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "I/O dispatcher 2945"

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "pool-1170-thread-1"

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "I/O dispatcher 5441"

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "I/O dispatcher 2467"

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "pool-3944-thread-1"

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "pool-2651-thread-1"

Solution

  • Thanks, Arvid. We dug through the logs and found that a bug in our code was leaking Elasticsearch connections. After fixing it and switching the state backend to RocksDB with incremental checkpointing turned on, the job is running smoothly.
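
The answer does not show the actual leak, so the following is only a minimal sketch of one common way such a leak arises: creating a client per record and never closing it. The high numbers in thread names like `pool-3944-thread-1` would be consistent with that, since the JDK's default thread factory numbers pools sequentially. `FakeClient` is a hypothetical stand-in for an Elasticsearch client (it is not a real Elasticsearch or Flink class), and the method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

final class ClientLifecycleSketch {

    /** Hypothetical stand-in for an Elasticsearch client that owns an I/O thread pool. */
    static final class FakeClient implements AutoCloseable {
        private final AtomicInteger openClients;
        private final ExecutorService io = Executors.newFixedThreadPool(1, r -> {
            Thread t = new Thread(r, "fake-io");
            t.setDaemon(true); // daemon, so the leaky demo cannot block JVM exit
            return t;
        });

        FakeClient(AtomicInteger openClients) {
            this.openClients = openClients;
            openClients.incrementAndGet();
        }

        void index(String doc) {
            io.execute(() -> { /* pretend to send doc to the cluster */ });
        }

        @Override
        public void close() {
            io.shutdown(); // releases the worker thread
            openClients.decrementAndGet();
        }
    }

    /** Anti-pattern: a new client per record, never closed. Returns clients left open. */
    static int clientPerRecord(List<String> records) {
        AtomicInteger open = new AtomicInteger();
        for (String r : records) {
            new FakeClient(open).index(r); // thread pool is never shut down
        }
        return open.get();
    }

    /** One client for the whole batch, closed deterministically. Returns clients left open. */
    static int sharedClient(List<String> records) {
        AtomicInteger open = new AtomicInteger();
        try (FakeClient client = new FakeClient(open)) {
            for (String r : records) {
                client.index(r);
            }
        }
        return open.get();
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("a", "b", "c");
        System.out.println("open after per-record clients: " + clientPerRecord(records));
        System.out.println("open after shared client: " + sharedClient(records));
    }
}
```

In a Flink job, the analogous approach would be to create the client once in a `RichSinkFunction`'s `open()` and release it in `close()`, or to rely on Flink's Elasticsearch connector, which manages the client itself.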