The requirement - A customer requires an automated mechanism that takes a manual snapshot of an AWS ElasticSearch domain (production) on a daily basis. The target of the snapshot is an AWS S3 bucket.
Expected flow
state==IN_PROGRESS - check snapshot status again, up to 10 times, at an interval of 5 mins
state==SUCCESS - end process (success)
state==IN_PROGRESS - when reaching 10 retries (50 mins), end process (failed)
state==FAILED - end process (failed)
Motivation - The automated snapshots taken by AWS can be used for disaster recovery or after a failed upgrade, but they cannot be used if someone accidentally (yes, it happened) deletes the whole ElasticSearch cluster.
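For illustration, the expected flow above could be sketched as a small polling loop. This is only a sketch under assumptions: `wait_for_snapshot` and `get_state` are hypothetical names, `get_state` stands in for whatever call returns the snapshot's `state` field, and any state other than SUCCESS or IN_PROGRESS is treated as a failure.

```python
import time

def wait_for_snapshot(get_state, max_retries=10, interval_seconds=300, sleep=time.sleep):
    """Return True if the snapshot reaches SUCCESS, False otherwise.

    get_state is a hypothetical callable returning the snapshot state string.
    """
    # One initial check plus up to max_retries re-checks.
    for attempt in range(max_retries + 1):
        state = get_state()
        if state == "SUCCESS":
            return True   # end process (success)
        if state != "IN_PROGRESS":
            return False  # FAILED (or, by assumption, any other state)
        if attempt < max_retries:
            sleep(interval_seconds)  # wait 5 mins before the next check
    # Still IN_PROGRESS after 10 retries (50 mins): end process (failed)
    return False
```

A single Lambda invocation is capped at 15 minutes, so the full 50-minute wait would not fit inside one function call; in a Step Functions setup the sleep would typically become a Wait state, with the Lambda doing only the status check.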
Haven't found an out-of-the-box Lambda/mechanism that meets the requirements. Suggestions? Thoughts?
P.S. I did a POC with AWS Step Functions + Lambda in a VPC, which seems to work, but I'd rather use a managed service or an actively maintained open-source project.
If you accidentally delete your AWS Elasticsearch domain, AWS Support can help you recover the domain along with its latest snapshot on a best-effort basis. This is not listed in the documentation, since ideally it shouldn't be your first bet.
Assuming this is a rare scenario, you should be fine. However, if you think there is a fair chance of your AWS ES cluster getting deleted again, you will be better off setting up a Lambda function that saves the latest snapshot to your own S3 bucket. This also saves you from depending on AWS Support.
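A minimal sketch of such a Lambda function might look like the following. It assumes an S3 snapshot repository has already been registered on the domain, that the function's role is allowed to call the domain, and that the event carries hypothetical `endpoint`, `repository`, and `region` keys; the `manual-YYYY-MM-DD` naming is also just an illustration. Requests are signed with SigV4 via botocore, which ships with the Lambda Python runtime.

```python
import datetime
import json
import urllib.request

def snapshot_url(endpoint, repository, when):
    """Build the _snapshot URL for a date-stamped manual snapshot."""
    name = "manual-" + when.strftime("%Y-%m-%d")  # hypothetical naming scheme
    return f"https://{endpoint}/_snapshot/{repository}/{name}"

def signed_put(url, region):
    """Send a SigV4-signed PUT with an empty body and return the JSON reply."""
    # Imported lazily so the URL helper above works without boto3 installed.
    import boto3
    from botocore.auth import SigV4Auth
    from botocore.awsrequest import AWSRequest

    credentials = boto3.Session().get_credentials()
    aws_request = AWSRequest(method="PUT", url=url)
    SigV4Auth(credentials, "es", region).add_auth(aws_request)  # "es" = service name

    request = urllib.request.Request(
        url, headers=dict(aws_request.headers), method="PUT"
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())

def handler(event, context):
    # event keys are assumptions for this sketch, not a fixed contract
    url = snapshot_url(event["endpoint"], event["repository"], datetime.date.today())
    return signed_put(url, event["region"])
```

Triggering this once a day from a scheduled rule would cover the "daily" part; checking whether the snapshot finished would be a separate GET on the same URL.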