Tags: jdbc, apache-kafka, debezium

Debezium / JDBC and Kafka topic retention


I have Debezium running in a container, capturing all changes to the records of a PostgreSQL database. In addition, I have a Kafka container to store the topic messages. Finally, I have a JDBC sink container to write all the changes to another database.

These three containers are working as expected: they take a snapshot of the existing data in specific tables and stream new changes, which are reflected in the destination database.

I have noticed that during this streaming the PostgreSQL WAL keeps growing. To overcome this, I enabled the following property on the source connector so that processed positions are flushed back and the old WAL can be reclaimed:

"heartbeat.interval.ms": 1000
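For context, here is a minimal sketch of where that property might sit when registering the source connector through the Kafka Connect REST API. The connector name (pg-source), Connect URL, and database settings are placeholders, and the property names assume a recent (2.x) Debezium release:

    curl -X PUT http://localhost:8083/connectors/pg-source/config \
      -H "Content-Type: application/json" \
      -d '{
        "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
        "database.hostname": "postgres",
        "database.port": "5432",
        "database.user": "postgres",
        "database.password": "postgres",
        "database.dbname": "inventory",
        "topic.prefix": "pg",
        "heartbeat.interval.ms": "1000"
      }'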

Now the PostgreSQL WAL is cleared on every heartbeat, as the retrieved changes are flushed. But even though the changes are committed into the secondary database, the Kafka topics remain exactly the same size.

Is there any way, or any property on the sink connector, that will force Kafka to delete committed messages?


Solution

  • Consumers have no control over topic retention; Kafka does not delete messages simply because a consumer has read and committed them.

    You may edit the topic config directly to reduce the retention time, but then your consumer must read the data within that window (see the example below).
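
For instance, retention can be lowered with the kafka-configs tool shipped with Kafka. This is a sketch rather than a recommendation for specific values; the broker address and topic name (pg.public.customers) are placeholders, and depending on your image the script may be named kafka-configs or kafka-configs.sh:

    kafka-configs.sh --bootstrap-server localhost:9092 \
      --entity-type topics --entity-name pg.public.customers \
      --alter --add-config retention.ms=3600000

This sets time-based retention to one hour (3600000 ms). Note that cleanup runs periodically on the broker, so log segments are removed some time after they expire, not the moment the sink connector commits them.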