Search code examples
amazon-web-servicesamazon-redshiftamazon-kinesisamazon-auroraamazon-kinesis-firehose

Sync data from Amazon Aurora to Redshift


I am trying to setup a sync between AWS Aurora and Redshift. What is the best way to achieve this sync?

Possible ways to sync can be: -

  1. Query table to find changes in a table(since I am only doing inserts, updates don't matter), export these changes to a flat file in S3 bucket and use Redshift copy command to insert into Redshift.

  2. Use python publisher and Boto3 to publish changes into a Kinesis stream and then consume this stream in Firehose from where I can copy directly into Redshift.

  3. Use Kinesis Agent to detect changes in binlog (Is it possible to detect changes int binlog using Kinesis Agent) and publish it to Firehose and from there copy into Firehose.

I haven't explored AWS Datapipeline yet.


Solution

  • As pointed out by @Mark B, the AWS Database Migration Service can migrate data between databases. This can be done as a one-off exercise, or it can run continuously, keeping two databases in sync.

    The documentation shows that Amazon Aurora can be a source and Amazon Redshift can be a target.