
How can I back up Cloudant to low-cost mass storage such as AWS Glacier?


One approach organisations sometimes use for backing up Cloudant is to run a standalone instance of CouchDB in their private network or a public network and replicate data from Cloudant to that CouchDB instance. The CouchDB data can then be exported to mass storage such as Amazon Glacier.

Questions:

  • What are the steps required to implement this?
  • Are there any gotchas to be aware of?

Solution

  • Here are the approximate steps:

    • set up a server running CouchDB (e.g. in EC2)
    • set up continuous replication from Cloudant to CouchDB
    • schedule a periodic (e.g. nightly) cron job to
      • copy the relevant .couch file to a working location
      • compress the copy
      • use the AWS command-line tools to upload the compressed file to S3
      • use the AWS command-line tools to move that S3 object to Glacier
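
The steps above can be sketched roughly as follows. This is a minimal illustration, not a tested backup tool: the function names, the `cloudant-backup` document ID, the `couchdb-backups` key prefix, and all URLs and paths are assumptions made for the example. The location of the `.couch` file also depends on your CouchDB version and configuration, so treat the path as a placeholder.

```python
import gzip
import shutil
from datetime import date
from pathlib import Path


def replication_doc(source_url, target_url, doc_id="cloudant-backup"):
    """Build a document for CouchDB's _replicator database.

    source_url and target_url are placeholders for the Cloudant database
    and the backup CouchDB database (including whatever credentials your
    setup requires).
    """
    return {
        "_id": doc_id,
        "source": source_url,
        "target": target_url,
        "continuous": True,      # keep replicating as changes arrive
        "create_target": True,   # create the backup database if missing
    }


def archive_couch_file(couch_file, work_dir, day=None):
    """Copy the .couch file aside, gzip the copy, return the archive path."""
    day = day or date.today()
    couch_file = Path(couch_file)
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)

    # Copy first so the compression step never reads the live file.
    snapshot = work_dir / f"{couch_file.stem}-{day.isoformat()}.couch"
    shutil.copy2(couch_file, snapshot)

    archive = snapshot.with_name(snapshot.name + ".gz")
    with open(snapshot, "rb") as src, gzip.open(archive, "wb") as dst:
        shutil.copyfileobj(src, dst)
    snapshot.unlink()  # keep only the compressed copy
    return archive


def s3_upload_command(archive, bucket, prefix="couchdb-backups"):
    """Return the AWS CLI invocation that uploads the archive to S3."""
    key = f"{prefix}/{Path(archive).name}"
    return ["aws", "s3", "cp", str(archive), f"s3://{bucket}/{key}"]
```

A nightly cron job could call `archive_couch_file(...)` and then pass the result of `s3_upload_command(...)` to `subprocess.run(cmd, check=True)`. The replication document would be saved once into the `_replicator` database on the CouchDB server.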

    Things to remember:

    • Glacier keeps every backup until you explicitly delete it (e.g. "delete the backup from 30 days ago"), so you keep paying for old backups. It is best to delete very old backups on a schedule.
    • With continuous replication, a document deleted on Cloudant is deleted on the CouchDB replica almost immediately (oops), so the replica alone does not protect you from accidental deletions.
    • Restoring is slow and awkward: you first have to retrieve the archive from Glacier, then restore it to CouchDB, and then replicate it back to Cloudant.
    • Cloudant will not be able to support your CouchDB installation - you will need to support it yourself.
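
One way to handle both the move to Glacier and the clean-up of old backups is an S3 lifecycle rule instead of a manual script. The sketch below only builds the rule as JSON. The rule ID, the key prefix, and the day counts are assumptions for illustration, and Glacier-class storage is generally billed for a minimum storage period, so check current pricing before choosing a short retention.

```python
import json


def lifecycle_config(prefix, glacier_after_days=1, expire_after_days=30):
    """Build an S3 lifecycle configuration for backup objects under prefix.

    Objects move to the GLACIER storage class after glacier_after_days
    and are deleted after expire_after_days.
    """
    if expire_after_days <= glacier_after_days:
        raise ValueError("expiry must come after the Glacier transition")
    return {
        "Rules": [
            {
                "ID": "couchdb-backup-retention",  # hypothetical rule name
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"}
                ],
                "Expiration": {"Days": expire_after_days},
            }
        ]
    }


def lifecycle_json(prefix, **kwargs):
    """Serialise the configuration so it can be saved to a file."""
    return json.dumps(lifecycle_config(prefix, **kwargs), indent=2)
```

The JSON can be written to a file such as `lifecycle.json` and applied with `aws s3api put-bucket-lifecycle-configuration --bucket <bucket> --lifecycle-configuration file://lifecycle.json`, where `<bucket>` is your own bucket name.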