amazon-s3 backup amazon-dynamodb elastic-map-reduce

Backup AWS DynamoDB to S3


The Amazon docs (http://aws.amazon.com/dynamodb/), among other places, suggest that you can back up your DynamoDB tables using Elastic MapReduce.
I have a general understanding of how this could work, but I couldn't find any guides or tutorials on it.

So my question is: how can I automate DynamoDB backups (using EMR)?

So far, I think I need to create a "streaming" job with a map function that reads the data from DynamoDB and a reduce function that writes it to S3, and I believe these could be written in Python (or Java, or a few other languages).

Any comments, clarifications, code samples, or corrections are appreciated.


Solution

  • With the introduction of AWS Data Pipeline, which has a ready-made template for DynamoDB-to-S3 backup, the easiest way is to schedule a backup in Data Pipeline [link].

    If you have special needs (data transformation, very fine-grained control, ...), consider the answer by @greg.
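
For a small table where neither Data Pipeline nor EMR is worth setting up, a plain scan-and-upload script may be enough. The following is only a minimal sketch of that alternative, not the Data Pipeline template or an EMR job. It assumes boto3-style clients (`scan` on the DynamoDB client, `put_object` on the S3 client), a table small enough to hold in memory, and no binary attributes (those would not serialize to JSON). The function names, table name, bucket, and key are hypothetical.

```python
import json


def scan_all_items(dynamodb_client, table_name):
    """Yield every item in the table, following scan pagination."""
    kwargs = {"TableName": table_name}
    while True:
        page = dynamodb_client.scan(**kwargs)
        yield from page.get("Items", [])
        last_key = page.get("LastEvaluatedKey")
        if not last_key:
            break  # no more pages
        # Resume the scan where the previous page stopped.
        kwargs["ExclusiveStartKey"] = last_key


def backup_table(dynamodb_client, s3_client, table_name, bucket, key):
    """Write the table to one S3 object, one JSON item per line.

    Items stay in DynamoDB's typed JSON form, e.g. {"id": {"S": "a"}}.
    Returns the number of items written.
    """
    lines = [
        json.dumps(item, sort_keys=True)
        for item in scan_all_items(dynamodb_client, table_name)
    ]
    body = "\n".join(lines).encode("utf-8")
    s3_client.put_object(Bucket=bucket, Key=key, Body=body)
    return len(lines)


# Hypothetical usage (requires boto3 and AWS credentials):
#   import boto3
#   backup_table(boto3.client("dynamodb"), boto3.client("s3"),
#                "my-table", "my-backup-bucket", "backups/my-table.jsonl")
```

To automate it, the script could be run from cron or any other scheduler. A full scan consumes read capacity on the table, so for large tables the Data Pipeline or EMR route is likely the better fit.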