amazon-web-services elasticsearch aws-elasticsearch

Elasticsearch mega-cluster vs smaller clusters when you have relatively small amounts of data

At the moment we have 3 separate environments, with 4 Elasticsearch clusters. Moreover, we have 2 different use cases: we search through customer data for similarities, and we search through logs.

From the Elasticsearch documentation and online video discussions, the recommendation is to optimise your cluster for your use case, so technically speaking we should then have 4 x 2 = 8 separate clusters. Some clusters could be grouped together on the basis of being production or non-production. But really we are a small team (< 10 people) and there isn't a lot of data in most of these clusters, so it is too expensive to run 3 master nodes in all of them.

Even though Elastic seems to recommend one cluster per use case so you can optimise your index and shard sizes, I believe we would get much better performance and stability if we had only 1 cluster, or at least went down to 2 (prod and non-prod). We'd also have less maintenance overhead. I'm getting CloudWatch alarms in every environment, there are saved objects which need to be transferred from one environment to another, and I have become the person who manages all this, which ends up creating a lot of wasted effort.

So my question is: even though best practice is to tune clusters per use case, does this still make sense when your data is only a couple of GiB, given the management overhead and the stability trade-offs of micro-clusters that don't have master nodes?

Solution

ES clusters are made up of nodes (most importantly data nodes and master nodes), and optimisation is not limited to the cluster level: you can easily fine-tune at the index and shard level based on your use case.

Since in your case you don't have much data, it makes sense to have just 2 environments (prod and non-prod). That avoids the cluster management overhead and makes it affordable to apply best practices such as having 3 master nodes for high availability.
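To illustrate why 3 master nodes are the usual minimum for high availability, here is a minimal sketch of the majority arithmetic. It assumes a simplified model in which a master can only be elected while a strict majority of master-eligible nodes is reachable; the function names are hypothetical, and real Elasticsearch versions manage the voting configuration themselves.

```python
def master_quorum(master_eligible_nodes: int) -> int:
    """Smallest strict majority of the master-eligible nodes (simplified model)."""
    return master_eligible_nodes // 2 + 1


def tolerated_failures(master_eligible_nodes: int) -> int:
    """How many master-eligible nodes can be lost while a majority remains."""
    return master_eligible_nodes - master_quorum(master_eligible_nodes)


# Compare the cluster sizes discussed in the question.
for nodes in (1, 2, 3):
    print(nodes, "master-eligible:",
          "quorum", master_quorum(nodes),
          "tolerates", tolerated_failures(nodes), "failure(s)")
```

Under this model, going from 1 to 2 master-eligible nodes buys no extra fault tolerance, while 3 nodes survive the loss of one, which is why paying for 3 masters in two consolidated clusters is easier to justify than in eight small ones.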

When it comes to optimising for the data and use case your indices are serving, you can do that per index. For example, for read-heavy indices it makes sense to have more replicas, and for indexing-heavy indices you might want to increase the refresh_interval (default 1s) to a higher value.
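As a rough sketch of per-index tuning inside one shared cluster, the snippet below builds the request path and JSON body for the update index settings API (`PUT /<index>/_settings`). The index names, profile names and concrete values (2 replicas, a 30s refresh interval) are illustrative assumptions, not recommendations; only `number_of_replicas` and `refresh_interval` come from the answer above. Nothing is sent to a cluster here.

```python
import json

# Hypothetical workload profiles; the values are assumptions for illustration.
PROFILES = {
    # Read-heavy: more replica copies to serve searches
    # (each replica needs a data node other than the one holding its primary).
    "read_heavy": {"number_of_replicas": 2},
    # Indexing-heavy: refresh less often than the 1s default.
    "write_heavy": {"refresh_interval": "30s"},
}


def settings_request(index_name: str, profile: str) -> tuple[str, str]:
    """Return (path, JSON body) for a PUT <index>/_settings request."""
    body = {"index": dict(PROFILES[profile])}
    return f"/{index_name}/_settings", json.dumps(body)


# Two use cases living in the same cluster, each tuned on its own index.
for index_name, profile in (("customer-similarity", "read_heavy"),
                            ("app-logs", "write_heavy")):
    path, body = settings_request(index_name, profile)
    print("PUT", path, body)
```

The printed path and body can be sent with any HTTP client or pasted into the Kibana Dev Tools console; both settings are dynamic, so they can be changed on an existing index without reindexing.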