I'm using Debezium 0.7 to read from MySQL, but I'm getting flush timeouts and OutOfMemoryError during the initial snapshot phase. Looking at the logs below, it seems the connector is trying to write too many messages in one go:
WorkerSourceTask{id=accounts-connector-0} flushing 143706 outstanding messages for offset commit [org.apache.kafka.connect.runtime.WorkerSourceTask]
WorkerSourceTask{id=accounts-connector-0} Committing offsets [org.apache.kafka.connect.runtime.WorkerSourceTask]
Exception in thread "RMI TCP Connection(idle)" java.lang.OutOfMemoryError: Java heap space
WorkerSourceTask{id=accounts-connector-0} Failed to flush, timed out while waiting for producer to flush outstanding 143706 messages [org.apache.kafka.connect.runtime.WorkerSourceTask]
I wonder what the correct settings (http://debezium.io/docs/connectors/mysql/#connector-properties) are for sizeable databases (>50 GB). I didn't have this issue with smaller databases. Simply increasing the timeout doesn't seem like a good strategy. I'm currently using the default connector settings.
Update: I changed the settings as suggested in the answer below and it fixed the problem:
OFFSET_FLUSH_TIMEOUT_MS: 60000 # default 5000
OFFSET_FLUSH_INTERVAL_MS: 15000 # default 60000
MAX_BATCH_SIZE: 32768 # default 2048
MAX_QUEUE_SIZE: 131072 # default 8192
HEAP_OPTS: '-Xms2g -Xmx2g' # default '-Xms1g -Xmx1g'
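For reference, here is how those values fit into a deployment. This is only a sketch assuming the Debezium Connect Docker image (which reads the OFFSET_FLUSH_* and HEAP_OPTS environment variables); the broker address, topic names, and image tag are placeholders, not taken from my actual setup:

version: '2'
services:
  connect:
    image: debezium/connect:0.7
    ports:
      - "8083:8083"
    environment:
      BOOTSTRAP_SERVERS: kafka:9092          # placeholder broker address
      GROUP_ID: 1
      CONFIG_STORAGE_TOPIC: connect_configs  # placeholder topic names
      OFFSET_STORAGE_TOPIC: connect_offsets
      OFFSET_FLUSH_TIMEOUT_MS: 60000         # raised from the 5000 default
      OFFSET_FLUSH_INTERVAL_MS: 15000        # lowered from the 60000 default
      HEAP_OPTS: '-Xms2g -Xmx2g'             # doubled JVM heap

Note that max.batch.size and max.queue.size are per-connector Debezium properties rather than worker settings, so unless your deployment templating injects MAX_BATCH_SIZE and MAX_QUEUE_SIZE into the connector config, they belong in the connector's JSON configuration (see the example after the answer below).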
This is a very complex question. First of all, the default memory settings for the Debezium Docker images are quite low, so if you are using them it might be necessary to increase them.
Next, there are multiple factors at play. I recommend the following steps:
1. Increase max.batch.size and max.queue.size - this reduces the number of commits.
2. Increase offset.flush.timeout.ms - this gives Connect time to process the accumulated records.
3. Decrease offset.flush.interval.ms - this should reduce the amount of accumulated offsets.

Applied to the worker and connector configuration, these changes look like the sketch below. Unfortunately, there is an issue, KAFKA-6551, lurking backstage that can still play havoc.
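For completeness, here is a sketch of the same changes applied to a plain (non-Docker) Kafka Connect distributed worker. The paths, hostnames, and connection details are illustrative; max.batch.size and max.queue.size are the connector-level Debezium properties behind the MAX_BATCH_SIZE / MAX_QUEUE_SIZE values in the question:

# connect-distributed.properties - worker-level flush settings
offset.flush.timeout.ms=60000
offset.flush.interval.ms=15000

# larger heap for the Connect worker JVM, honored by Kafka's run scripts
export KAFKA_HEAP_OPTS="-Xms2g -Xmx2g"
bin/connect-distributed.sh config/connect-distributed.properties

# per-connector batch/queue sizing, set when registering the connector
curl -X POST -H "Content-Type: application/json" http://localhost:8083/connectors -d '{
  "name": "accounts-connector",
  "config": {
    "connector.class": "io.debezium.connector.mysql.MySqlConnector",
    "database.hostname": "mysql",
    "database.port": "3306",
    "database.user": "debezium",
    "database.password": "...",
    "database.server.id": "184054",
    "database.server.name": "accounts",
    "database.history.kafka.bootstrap.servers": "kafka:9092",
    "database.history.kafka.topic": "dbhistory.accounts",
    "max.batch.size": "32768",
    "max.queue.size": "131072"
  }
}'

Note the 4:1 ratio between max.queue.size and max.batch.size, which mirrors the ratio of the defaults (8192 to 2048).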