amazon-web-services amazon-s3 loading

Automate unloading Redshift data, files from drives, and web API data to an S3 bucket


I have some data to be ingested from Redshift into Snowflake. Currently, for Redshift I use the UNLOAD command to get the data into an S3 bucket and then load it into Snowflake; for other files I use aws s3 cp. I am looking for a way to automate this unloading. I also have a use case to load only the delta of this data, so it will have to run on a schedule. Any suggestions would be appreciated.


Solution

  • UPDATE: AWS Data Pipeline is deprecated: https://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/migration.html

    You could use a Lambda function if you are confident that the execution time will stay under 15 minutes. Otherwise, you could use a data pipeline. You can schedule EC2 instances to execute your process. You could write a bash script that does what you require (unload data from Redshift, etc.), or you could use an existing template and modify it visually in the "architect" tool in the AWS console.

    Create a new pipeline using a template (Full copy of RDS MySql to S3)

    Configure the pipeline and press Edit in architect

    You will see a graph of the pipeline, which you can easily adjust to use Redshift instead of RDS.

    [Screenshot: pipeline graph adjusted to use Redshift]
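
For the Lambda route mentioned above, a minimal sketch of a delta unload might look like the following. It builds an UNLOAD statement that exports only rows newer than a watermark and submits it through the Redshift Data API. The event keys (`table`, `watermark_column`, `since`, `bucket`, `iam_role_arn`, `cluster_id`, `database`, `db_user`) and the S3 prefix layout are assumptions made for illustration, not anything from the question; the table and column names are interpolated directly into the SQL, so they are assumed to come from trusted configuration.

```python
from datetime import datetime, timezone


def build_unload_sql(table, watermark_column, since, s3_prefix, iam_role_arn):
    """Build an UNLOAD statement that exports rows changed after `since`."""
    query = (
        f"SELECT * FROM {table} "
        f"WHERE {watermark_column} > '{since.strftime('%Y-%m-%d %H:%M:%S')}'"
    )
    # UNLOAD takes the query as a quoted string, so inner single quotes are doubled.
    escaped = query.replace("'", "''")
    return (
        f"UNLOAD ('{escaped}') "
        f"TO '{s3_prefix}' "
        f"IAM_ROLE '{iam_role_arn}' "
        "FORMAT AS PARQUET"
    )


def lambda_handler(event, context):
    # Imported here so build_unload_sql can be exercised without AWS dependencies.
    import boto3

    # Hypothetical event shape: the caller passes the last loaded watermark.
    since = datetime.fromisoformat(event["since"])
    run_id = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    sql = build_unload_sql(
        table=event["table"],
        watermark_column=event["watermark_column"],
        since=since,
        # One prefix per run so earlier exports are not overwritten.
        s3_prefix=f"s3://{event['bucket']}/{event['table']}/{run_id}/part_",
        iam_role_arn=event["iam_role_arn"],
    )
    client = boto3.client("redshift-data")
    # execute_statement returns immediately; the UNLOAD runs inside Redshift.
    response = client.execute_statement(
        ClusterIdentifier=event["cluster_id"],
        Database=event["database"],
        DbUser=event["db_user"],
        Sql=sql,
    )
    return {"statement_id": response["Id"]}
```

Because the Data API call is asynchronous, the function only submits the statement and returns its id, so the 15-minute Lambda limit applies to the submission rather than to the unload itself. Tracking the last successful watermark and triggering the function on a schedule (for example with an Amazon EventBridge schedule) are left out of this sketch.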