Search code examples
amazon-web-servicesamazon-s3amazon-kinesisamazon-kinesis-firehose

AWS: Setting up a kinesis stream from PostgreSQL to Redshift


In reference to my previous question, I got my boss to go ahead and let me set up a DMS from my existing postgres to our new redshift db for our analytics team.

The next issue that I am having, and after spending 3 days doing searching on this has provided nothing to help me with this. My boss wants to use Kinesis to pull real-time data from our PG db to our RS db so our analytics team can pull data in real time from it. I'm trying to get this configured and I'm running into nothing but headaches.

I have a Stream set up, Firehose set up to grab from our S3 bucket that I created called "postgres-stream-bucket", but I'm not sure how to get data to dump into it from PG, and then making sure that RS picks everything up and uses it, in real time.

However, if there are better options I would love to hear them, but it is imperative that we have real-time (or as close as possible) translated data.


Solution

  • Amazon Kinesis Firehose is ideal if you have streaming data coming into your systems. It will collect the records, batch them and load them into Redshift. However, it is not an ideal solution for what you have described, where your source is a database rather than random streams of data.

    Since you already have the Database Migration Service setup, you can continue to use it for continuous data replication between PostgreSQL and Redshift. This would be the simplest and most effective solution.