I have a cluster of 4 Elasticsearch (version 7.7.1) nodes on AWS EC2 (including a master). An incremental snapshot of all indices is taken daily.
Because the cluster runs on "I"-type EC2 instances for good read/write performance, the storage is volatile: if the cluster crashes, all data is gone. That's not something I can change at the moment.
I simulated a cluster crash by terminating my instances, then rebuilding the cluster and restoring my daily snapshots into it.
I can't find a way to accelerate the restoration of snapshots on the new cluster. The client I work for wants 14 days of data on the cluster. Snapshots being daily incremental, that means I have to restore 14 snapshots to recover all my data, one day at a time. I can't make ONE full restoration.
At each restoration, the data is rebalanced between nodes according to the replication policy (1:1). I have to wait quite a long time between a restoration and the cluster status turning green. For 500 GB of data (which is nothing compared to the volume we expect in the future), it took me more than 2 hours to restore all my snapshots.
During this process, all indices must be closed, but restoring a snapshot opens some indices, so I have to close them all again before restoring each snapshot... Kibana and Logstash are sending to/listening to the cluster, so I have to stop them in order to restore my snapshots quietly.
Is there any way to improve this? I can't find a way to restore multiple snapshots at once. Should I stop the rebalancing while the restoration is in progress?
I'm surprised I can't find anything on this, I must've missed something big :/ Any ideas or feedback from experience? Thanks a lot!
Snapshots being daily incremental, that means I have to restore 14 snapshots to recover all my data, one day at a time.
This statement is not correct! All you have to do is to restore the latest snapshot and that will restore all of your data.
Being incremental means that each snapshot only copies the data created since the previous snapshot, but it also references the older data already stored in the repository by earlier snapshots. You don't have to care about that: just restore the latest one and you'll see that you have all your data.
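As a rough illustration of "just restore the latest one", here is a minimal Python sketch using only the standard library. It relies on the snapshot REST endpoints (`GET /_snapshot/<repository>/_all` and `POST /_snapshot/<repository>/<snapshot>/_restore`). The base URL, the repository name `my_repo`, and the helper function names are placeholders I made up, and I'm assuming the listing response carries a top-level `snapshots` array, so check that against your version before relying on it.

```python
import json
import urllib.request


def latest_successful_snapshot(snapshots):
    """Return the name of the most recent snapshot whose state is SUCCESS."""
    done = [s for s in snapshots if s.get("state") == "SUCCESS"]
    if not done:
        raise ValueError("no successful snapshot found")
    return max(done, key=lambda s: s["start_time_in_millis"])["snapshot"]


def build_restore_request(base_url, repository, snapshot, indices="*"):
    """Build (without sending) the POST request that restores one snapshot."""
    url = (
        f"{base_url}/_snapshot/{repository}/{snapshot}/_restore"
        "?wait_for_completion=false"
    )
    # Restore every index, but leave the cluster-wide state untouched.
    body = {"indices": indices, "include_global_state": False}
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def restore_latest(base_url, repository):
    """List the repository's snapshots and restore only the newest one."""
    listing_url = f"{base_url}/_snapshot/{repository}/_all"
    with urllib.request.urlopen(listing_url) as resp:
        # Assumption: the response has a top-level "snapshots" list.
        snapshots = json.load(resp)["snapshots"]
    name = latest_successful_snapshot(snapshots)
    request = build_restore_request(base_url, repository, name)
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

On a freshly rebuilt cluster you would call something like `restore_latest("http://localhost:9200", "my_repo")` once, after registering the repository, instead of looping over the 14 daily snapshots.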