I want to use Debezium with Kafka Connect, but I need to do my processing once a transaction has completed. The transaction updates many tables. I can get the GTID for each message, but how do I know when I have received all messages for that transaction? I cannot start processing until I know everything has been updated.
Debezium does not indicate transaction completion right now. The plan for the future is to include a sequence number for each change in the transaction and to mark which record is the last one. It should then be easy to aggregate the changes back into a single message.
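As a stopgap solution you can use Kafka Streams session windows, where the session identifier is the transaction id (GTID). A session closes once no new record with that GTID has arrived for a configured inactivity gap, so this is a heuristic: it assumes all changes of one transaction arrive close together in time.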
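To illustrate the idea behind the session-window approach, here is a minimal plain-Java sketch. It does not use the Kafka Streams API; it only mimics what a session window keyed by GTID does. The class name `TxSessionBuffer`, the string event payloads, and the gap value are made up for the example. In Kafka Streams itself you would roughly re-key the stream by GTID, then use `groupByKey()` and `windowedBy(...)` with a `SessionWindows` definition.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/**
 * Hypothetical helper: buffers change events per transaction id (GTID) and
 * releases a transaction's events once that GTID has been idle for a gap.
 */
class TxSessionBuffer {

    /** One open transaction: its events and the latest event timestamp. */
    private static final class Session {
        final List<String> events = new ArrayList<>();
        long lastSeenMs;
    }

    private final long gapMs;
    private final Map<String, Session> open = new HashMap<>();

    TxSessionBuffer(long gapMs) {
        this.gapMs = gapMs;
    }

    /** Adds one change event to the session of its GTID. */
    void add(String gtid, String event, long timestampMs) {
        Session s = open.computeIfAbsent(gtid, k -> new Session());
        s.events.add(event);
        s.lastSeenMs = Math.max(s.lastSeenMs, timestampMs);
    }

    /** Removes and returns every transaction idle for longer than the gap. */
    Map<String, List<String>> flush(long nowMs) {
        Map<String, List<String>> closed = new HashMap<>();
        Iterator<Map.Entry<String, Session>> it = open.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Session> e = it.next();
            if (nowMs - e.getValue().lastSeenMs > gapMs) {
                closed.put(e.getKey(), e.getValue().events);
                it.remove();
            }
        }
        return closed;
    }
}
```

You would call `add` for every incoming record and `flush` periodically, then process each returned batch as one transaction. Note the limitation: if a record for a GTID arrives after its session was already flushed, it ends up in a new batch, so the gap has to be chosen generously.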