Search code examples
mysqlgoogle-cloud-platformgoogle-bigqueryreplicationgoogle-cloud-data-fusion

How to make GCP Data Fusion MySQL Replication work well with DateTime columns


I managed to have MySQL tables replicated into BigQuery fairly easily by following this article on Cloud Data Fusion Replication. However, there's an issue with the DateTime columns. All the DateTime columns have been replicated into BigQuery using a 1970's date. Does anyone know how to fix this?

Here is the original MySQL data:
MySQL table And here's the replicated data in BigQuery BigQuery table


Solution

  • I figured another way. You can simulate MySQL replication into BigQuery by making your own batch pipeline, then schedule that pipeline to run at the frequency you want. The MySQL setup is easy to do. Just follow the instructions to install the MySQL driver here. Then you setup your MySQL data source and your BigQuery Sink. The DateTime columns in MySQL should be marked as TimeStamps and their corresponding columns in BigQuery must be of type DateTime.

    MySQL Data Source Configs enter image description here

    BigQuery Sink Configs enter image description here

    Finally, you can make a BigQuery Execution Action before the MySQL Source to fetch the id or time of the latest record you have replicated.

    enter image description here