I am evaluating Kafka for syncing on-premises and cloud DBs in our company project, and I ran into a strange effect: with a JDBC source, new data reach the topic only 2 hours after they were actually created or modified. Here is what happens:
Existing software writes a data entry to the on-premises Oracle DB. The column LAST_CHANGED_AT is set to the customer's local time (say, 17.10.23 15:52:48,352053). The column type is TIMESTAMP. I can't influence this process or the data.
A JdbcSourceConnector, which monitors that column and polls every second, is running in a corporate Kafka instance. The local time on that server is shifted by two hours, so at the moment the entry is created, the server's local time is 13:52:48. The data are not fetched until the server's time reaches 15:52:49, so apparently the JdbcSourceConnector ignores timestamps that lie in the future from its point of view.
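To make the arithmetic concrete, here is a minimal Python sketch of the behavior I observed. It is not connector code: `is_fetchable` is a hypothetical helper, and it assumes the connector only picks up rows whose timestamp is not later than the current time as the server sees it. The microseconds on the server clock are also assumed, just to keep the numbers round.

```python
from datetime import datetime, timedelta


def is_fetchable(row_ts: datetime, server_now: datetime) -> bool:
    """Hypothetical rule: a row is picked up only once its timestamp is not in the future."""
    return row_ts <= server_now


# Value written to LAST_CHANGED_AT (wall-clock time, no zone information).
row_ts = datetime(2023, 10, 17, 15, 52, 48, 352053)

# Server clock at the same instant, two hours behind (microseconds assumed).
server_now = datetime(2023, 10, 17, 13, 52, 48, 352053)

# How long the row stays "in the future" for the server.
wait = row_ts - server_now

print(is_fetchable(row_ts, server_now))         # False
print(wait)                                     # 2:00:00
print(is_fetchable(row_ts, server_now + wait))  # True
```

Under that assumption the row becomes visible exactly two hours late, which matches what I see.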
As we need the data synced to the cloud with a latency of at most 1-2 seconds, such a delay is unacceptable. Being a newbie to Kafka, I have not been able to find a solution yet. Could anyone help me resolve this?
I fixed the issue as follows: the Kafka server was running in UTC, while the database was writing CET entries, so the connector interpreted the stored timestamps in the wrong time zone. To solve it I set db.timezone: "Europe/Copenhagen" in the connector configuration. Now the delay is 0.5 seconds on average, which is absolutely fine and consistent with the 1-second polling interval.
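For anyone landing here, a rough sketch of where the setting goes, written as a Python script that builds the JSON body for the Kafka Connect REST API. Only `db.timezone`, the LAST_CHANGED_AT column and the 1-second polling come from my setup; the connector name, connection URL, table name and topic prefix are placeholders, and `mode: timestamp` is my assumption about a typical configuration for this use case.

```python
import json

# Sketch of a JDBC source connector definition. Placeholder values are marked.
connector = {
    "name": "oracle-last-changed-source",  # placeholder name
    "config": {
        "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
        "connection.url": "jdbc:oracle:thin:@//db-host:1521/SERVICE",  # placeholder
        "table.whitelist": "MY_TABLE",  # placeholder
        "topic.prefix": "onprem-",  # placeholder
        "mode": "timestamp",  # assumed: detect changes via a timestamp column
        "timestamp.column.name": "LAST_CHANGED_AT",
        "poll.interval.ms": "1000",  # poll every second
        # Time zone in which the database stores its zone-less TIMESTAMP values.
        "db.timezone": "Europe/Copenhagen",
    },
}

# JSON body that could be sent with POST /connectors to a Connect worker.
payload = json.dumps(connector, indent=2)
print(payload)
```

A region name such as Europe/Copenhagen, rather than a fixed offset, should also follow the switch between summer and winter time.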