I'm using the Debezium connector version 0.8 to capture changes from a MySQL database and move them to Kafka. I'm using Docker with one container for MySQL, another for the connector, and another for Kafka.
When I stop Docker (docker-compose down) and start it again, I usually get the following error:
org.apache.kafka.connect.errors.ConnectException: The db history topic is missing. You may attempt to recover it by reconfiguring the connector to SCHEMA_ONLY_RECOVERY
I have read the solution for this issue on the official page here:
https://debezium.io/blog/2018/03/16/note-on-database-history-topic-configuration/
But I followed those steps and I think my configuration is ok:
log.retention.bytes = -1
log.retention.hours = 168
log.retention.minutes = null
log.retention.ms = -1
Note that if I set log.retention.ms to -1, then log.retention.minutes and log.retention.hours are not used, as the official documentation explains, so I believe I have solved both the retention size and the retention time problems.
So, does anybody know why I'm getting this error?
This is part of a university project. I don't think I can share the complete docker-compose file before I publish it at my university, but I can show you the important parts related to this problem. I don't think this is a configuration problem because I have nothing special in my docker-compose.
mysql:
  image: mysql/5.7:configured  # (Little changes like enabling queries...)
  environment:
    - MYSQL_ROOT_PASSWORD=debezium
    - MYSQL_USER=mysqluser
    - MYSQL_PASSWORD=mysqlpw
  volumes:
    - "sql_Data:/var/lib/mysql"
    - "sql_LogError:/var/log/mysql"
kafka:
  image: debezium/kafka:0.8
  depends_on:
    - zookeeper
  environment:
    - HOST_NAME=xxxx
    - ADVERTISED_HOST_NAME=xxxx
    - ZOOKEEPER_CONNECT=zookeeper:2181
    - KAFKA_CREATE_TOPICS="events:1:1"
    - KAFKA_LOG_RETENTION_MS=-1
  volumes:
    - "kafka_Data:/kafka/data"
    - "kafka_Log:/kafka/logs"
    - "kafka_Conf:/kafka/config"
connect:
  image: debezium/connect:0.8
  depends_on:
    - zookeeper
    - kafka
    - mysql
  environment:
    - HOST_NAME=xxxx
    - ADVERTISED_HOST_NAME=xxxx
    - BOOTSTRAP_SERVERS=xxxx:9092
    - GROUP_ID=1
    - CONFIG_STORAGE_TOPIC=my_connect_configs
    - OFFSET_STORAGE_TOPIC=my_connect_offsets
    - STATUS_STORAGE_TOPIC=my_connect_statuses
volumes:
  sql_Data:
  sql_LogError:
  kafka_Data:
  kafka_Log:
  kafka_Conf:
The other parts are only networks or things not relevant to this problem.
Finally, after struggling with this problem for many days, I found the cause and the solution.
There is an error in the documentation of the debezium/zookeeper image, as you can see on the image's Docker Hub page.
The documentation lists three volumes for persisting all the data ZooKeeper needs. The paths to these volumes are:
/zookeeper/data
/zookeeper/logs
/zookeeper/conf
The problem is that the second one is wrong. According to the image's Dockerfile, the path to the second volume, which is used to save the transaction log, must be:
/zookeeper/txns
Here is a snippet of its Dockerfile:
# Expose the ports and set up volumes for the data, transaction log, and configuration
EXPOSE 2181 2888 3888
VOLUME ["/zookeeper/data","/zookeeper/txns","/zookeeper/conf"]
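For illustration, a zookeeper service with the corrected mount points might look like the sketch below. The volume names (zookeeper_Data, zookeeper_Txns, zookeeper_Conf) are placeholders I made up, and the 0.8 tag is only assumed to match the other images. My understanding of why this matters: docker-compose down keeps named volumes but removes the containers, so a path declared in the Dockerfile and not covered by a named volume ends up in a fresh anonymous volume on the next start. With /zookeeper/logs mounted in place of /zookeeper/txns, the transaction log is probably not carried over, and since this Kafka version keeps its topic metadata in ZooKeeper, that would explain why the history topic appears to be missing.

```yaml
zookeeper:
  image: debezium/zookeeper:0.8   # tag assumed to match the kafka/connect images
  volumes:
    - "zookeeper_Data:/zookeeper/data"
    - "zookeeper_Txns:/zookeeper/txns"   # transaction log: /zookeeper/txns, not /zookeeper/logs
    - "zookeeper_Conf:/zookeeper/conf"

volumes:
  # placeholder names; add these next to the existing sql_* and kafka_* volumes
  zookeeper_Data:
  zookeeper_Txns:
  zookeeper_Conf:
```

Note that docker-compose down -v would still delete these named volumes, so the data only survives a plain docker-compose down.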