mysql mariadb debezium

Debezium MySQL (MariaDB) - wrong message counts


I'm exporting a database with Debezium. I previously tested this setup with about 1% of the production data and it worked correctly, but in production I'm getting a mismatch between the row counts in the database and the number of messages Debezium exported.

E.g. I have a table db.large which has ~259 million entries, but Debezium exported only about 202 million. For some other tables, Debezium exported more messages than there are rows in the table (this is just during the initial snapshot). For a small table with just 542 entries, the counts match.

I see some "Failed to flush" and "Failed to commit offsets" messages in the logs, but they do not occur for every offset flush; some flushes succeed. Could these flush/commit failures be the reason for the mismatch?

I'm using the MySQL connector with Debezium 1.7.

Here are partial logs demonstrating the mismatch:

INFO   ||  WorkerSourceTask{id=connector-v1-0} flushing 5722 outstanding messages for offset commit
ERROR  ||  WorkerSourceTask{id=connector-v1-0} Failed to flush, timed out while waiting for producer to flush outstanding 211 messages
ERROR  ||  WorkerSourceTask{id=connector-v1-0} Failed to commit offsets
INFO   MySQL|connector_v1|snapshot       Exported 201944873 of 259000000 records for table 'db.large' after 10:09:38.853
INFO   MySQL|connector_v1|snapshot       Exported 202002217 of 259000000 records for table 'db.large' after 10:09:49.062
INFO   MySQL|connector_v1|snapshot       Exported 202057513 of 259000000 records for table 'db.large' after 10:09:59.281
INFO   MySQL|connector_v1|snapshot       Exported 202112809 of 259000000 records for table 'db.large' after 10:10:09.488
INFO   MySQL|connector_v1|snapshot       Exported 202168105 of 259000000 records for table 'db.large' after 10:10:19.669
INFO   MySQL|connector_v1|snapshot       Exported 202221353 of 259000000 records for table 'db.large' after 10:10:30.152
INFO   ||  WorkerSourceTask{id=connector-v1-0} flushing 5788 outstanding messages for offset commit
INFO   MySQL|connector_v1|snapshot       Exported 202278697 of 259000000 records for table 'db.large' after 10:10:40.334
ERROR  ||  WorkerSourceTask{id=connector-v1-0} Failed to flush, timed out while waiting for producer to flush outstanding 561 messages
ERROR  ||  WorkerSourceTask{id=connector-v1-0} Failed to commit offsets
INFO   MySQL|connector_v1|snapshot       Exported 202336041 of 259000000 records for table 'db.large' after 10:10:50.352
INFO   MySQL|connector_v1|snapshot       Finished exporting 202353026 records for table 'db.large'; total duration '10:10:53.191'
INFO   MySQL|connector_v1|snapshot  Exporting data from table 'db.small' (2 of 7 tables)
INFO   MySQL|connector_v1|snapshot       For table 'db.small' using select statement: 'SELECT `field1`, `field2`, `field3` FROM `db`.`small`'
INFO   MySQL|connector_v1|snapshot       Finished exporting 500 records for table 'db.small'; total duration '00:00:00.021'
INFO   MySQL|connector_v1|snapshot  Exporting data from table 'db.medium' (3 of 7 tables)
INFO   MySQL|connector_v1|snapshot       For table 'db.medium' using select statement: 'SELECT `field1`, `field2`, `field3`  FROM `db`.`medium`'
INFO   MySQL|connector_v1|snapshot       Exported 84873 of 14000000 records for table 'db.medium' after 00:00:10.006
INFO   MySQL|connector_v1|snapshot       Exported 170889 of 14000000 records for table 'db.medium' after 00:00:20.172
INFO   MySQL|connector_v1|snapshot       Exported 258953 of 14000000 records for table 'db.medium' after 00:00:30.267
INFO   MySQL|connector_v1|snapshot       Exported 349065 of 14000000 records for table 'db.medium' after 00:00:40.392

Any thoughts? Thanks


Solution

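  • Figured this out: the number of messages exported was actually correct.

    The total shown in those log lines (the "of 259000000" part) is not an actual row count but an estimate that Debezium obtains from the database, so it can be higher or lower than the real number of rows. The figure in the "Finished exporting ..." line is the number of records that were really exported. See: https://github.com/debezium/debezium/blob/8d71080a9a8aac875e338964af417dc8de93dfcc/debezium-connector-mysql/src/main/java/io/debezium/connector/mysql/MySqlConnection.java#L427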
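
To make the distinction concrete, here is a small sketch in Python that separates the estimated totals from the real exported counts in snapshot logs like the ones above. The function name `summarize_snapshot` and the regular expressions are my own, written against the log format shown in the question; they are not part of Debezium. As far as I can tell, the estimate at the linked commit comes from `SHOW TABLE STATUS`, whose `Rows` value is only approximate for InnoDB tables, so treat that detail as an assumption and verify any count you care about with `SELECT COUNT(*)`.

```python
import re

# Final per-table line: carries the number of records actually exported.
FINISHED = re.compile(r"Finished exporting (\d+) records for table '([^']+)'")
# Progress line: the number after "of" is only the estimated table size.
PROGRESS = re.compile(r"Exported (\d+) of (\d+) records for table '([^']+)'")


def summarize_snapshot(log_lines):
    """Return {table: (estimated_total, actual_exported)} from snapshot logs.

    Either element is None when the corresponding log line was not seen
    (e.g. small tables finish before any progress line is printed).
    """
    estimated = {}
    actual = {}
    for line in log_lines:
        match = FINISHED.search(line)
        if match:
            actual[match.group(2)] = int(match.group(1))
            continue
        match = PROGRESS.search(line)
        if match:
            estimated[match.group(3)] = int(match.group(2))
    tables = sorted(set(estimated) | set(actual))
    return {t: (estimated.get(t), actual.get(t)) for t in tables}


if __name__ == "__main__":
    sample = [
        "INFO   MySQL|connector_v1|snapshot       Exported 202336041 of 259000000 records for table 'db.large' after 10:10:50.352",
        "INFO   MySQL|connector_v1|snapshot       Finished exporting 202353026 records for table 'db.large'; total duration '10:10:53.191'",
    ]
    for table, (est, real) in summarize_snapshot(sample).items():
        print(f"{table}: estimated={est} actual={real}")
```

The "actual" value is the one to compare against the number of messages in the Kafka topic; the "estimated" value is only useful as a rough progress indicator.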