apache-kafka apache-kafka-connect debezium

How to get database.server.name for Kafka Debezium MySQL connector?


EDITING the question:

I'm trying to configure a Debezium MySQL connector for Kafka, using this example configuration as a starting point:

https://debezium.io/documentation/reference/stable/connectors/mysql.html#mysql-example-configuration

I have:

  • hostname: "ec2-xxx.compute.amazonaws.com"
  • database: mycooldb (with all my tables inside)

Then I set the following properties:

"database.hostname": "ec2-xxx.compute.amazonaws.com"
"database.include.list": "mycooldb"

Debezium has another property called "database.server.name". How can I find the server name value on the MySQL server?

A server can host multiple databases, so database.include.list can contain a list of databases.

database.hostname is the hostname or the IP address.

I'm not sure what database.server.name is or how to get its value from the MySQL server. If I want to include multiple databases in database.include.list, what should the value of database.server.name be?


Solution

  • What is the difference between database.server.name and database.hostname?

    Per the docs:

    • database.hostname: IP address or host name of the MySQL database server
    • database.server.name: Logical name that identifies and provides a namespace for the particular MySQL database server/cluster in which Debezium is capturing changes. The logical name should be unique across all other connectors, since it is used as a prefix for all Kafka topic names that receive events emitted by this connector. Only alphanumeric characters, hyphens, dots and underscores must be used in the database server logical name.

    So database.hostname must be the host name or IP address at which the database can be reached. database.server.name can be fred or foobar or sales or anything else you like. You don't look it up on the MySQL server; it's just a logical name you choose for that server, and it is used (as described above) as the prefix of the Kafka topic names.

    Without database.server.name you could run into a problem when ingesting a table called foo from two different databases using two different Debezium connectors: both would try to write to a Kafka topic called foo. Hence the comment in the docs that database.server.name "…provides a namespace".
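
    Since the name is something you pick rather than something you read from MySQL, a quick sanity check before deploying can help. The sketch below is only an illustration, not part of Debezium: the helper names are made up, the second hostname is a placeholder, and the regular expression assumes "alphanumeric" means ASCII letters and digits.

```python
import re

# Character rule quoted from the docs: alphanumerics, hyphens, dots, underscores.
# Assumption: "alphanumeric" is taken to mean ASCII letters and digits.
_ALLOWED = re.compile(r"[A-Za-z0-9._-]+")


def is_valid_server_name(name):
    """Return True if the whole name uses only the allowed characters."""
    return _ALLOWED.fullmatch(name) is not None


def duplicate_server_names(connector_configs):
    """Return the logical names used by more than one connector config."""
    seen, dupes = set(), set()
    for cfg in connector_configs:
        name = cfg["database.server.name"]
        if name in seen:
            dupes.add(name)
        seen.add(name)
    return dupes


# Two hypothetical connectors that were (wrongly) given the same logical name.
connectors = [
    {"database.hostname": "ec2-xxx.compute.amazonaws.com",
     "database.server.name": "fred"},
    {"database.hostname": "other-host.example.com",
     "database.server.name": "fred"},
]

print(is_valid_server_name("fred"))        # valid: letters only
print(is_valid_server_name("my cool db"))  # invalid: contains spaces
print(duplicate_server_names(connectors))  # names that need to be changed
```

    Any name reported by duplicate_server_names would lead two connectors to share a topic prefix, which is exactly the clash the namespace is meant to prevent.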


    Edit: Regarding your comments, my answer still stands. The docs detail topic naming: the MySQL database name forms part of the topic name, as does database.server.name. If you connect to one MySQL host (let's say we set database.server.name=fred) and pull data from two databases on it called sales and warehouse, each with a table called audit, you'd get two Kafka topics:

    • fred.sales.audit
    • fred.warehouse.audit
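
    The naming pattern above can be sketched in a few lines. This is a rough illustration of the serverName.databaseName.tableName convention, not Debezium's own code, and topic_names is a hypothetical helper. In Debezium 2.x the same role is played by the topic.prefix property.

```python
def topic_names(server_name, tables_by_database):
    """Build topic names as <server name>.<database name>.<table name>."""
    return [
        f"{server_name}.{database}.{table}"
        for database, tables in tables_by_database.items()
        for table in tables
    ]


# One logical server name, two databases from database.include.list,
# each containing a table called audit.
captured = {"sales": ["audit"], "warehouse": ["audit"]}

for topic in topic_names("fred", captured):
    print(topic)
# prints:
# fred.sales.audit
# fred.warehouse.audit
```

    So one database.server.name covers every database in database.include.list; the database name in the middle of the topic keeps the two audit tables apart.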