elasticsearch sharding

What are the risks of large shards in Elasticsearch?


At my workplace, each of our ES indices is configured to have exactly 5 shards, and we make no use of the Rollover API or ILM. Most of our indices are quite small, but we have one large index where each individual shard is close to 250 GB. There is now discussion of ingesting additional data that would roughly double the size of that index.

I'm trying to pump the brakes on this because, from my understanding of best practices (e.g. those described by Elastic Co. here), shards should ideally be <= 50 GB. My understanding of the risks involved with letting shards get too big:

  • Degraded search performance
  • More difficult/slower to recover from failure
  • More difficult/slower for ES to perform cluster rebalancing during normal operations

Are these accurate? Are there other risks that I should be aware of? I'm also a bit concerned that as the shards grow, memory issues could come into play and the entire cluster could become unstable. Is that a well-founded concern?


Solution

    • When a shard becomes too large, search queries can slow down, because each shard-level search has to work through more data.
    • When a node fails, Elasticsearch needs to reallocate and recover the shards from that node. Large shards take significantly longer to recover, which extends the time the cluster spends with reduced redundancy and increased vulnerability.
    • Elasticsearch clusters frequently rebalance shards across nodes to distribute data and load evenly. Larger shards make this process more resource-intensive and time-consuming.
    • Large shards can lead to memory pressure, particularly on nodes with limited heap or RAM.
    • Taking snapshots and restoring from them can become slower and more difficult with larger shards.
    • In extreme cases, very large shards raise the stakes of data loss. If a large shard becomes corrupted or is lost during a node failure, the amount of data lost can be substantial.
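
To put a rough number on the recovery and rebalancing points, here is a back-of-the-envelope sketch in Python. It assumes recovery traffic is throttled to 40 MB/s, which to my knowledge is the documented default for `indices.recovery.max_bytes_per_sec` on most node types; check your own cluster settings. The function name `transfer_hours` is made up for illustration, and the result is only a lower bound because it ignores translog replay, concurrent recoveries sharing the throttle, and disk or network contention.

```python
def transfer_hours(shard_gb: float, throttle_mb_per_s: float = 40.0) -> float:
    """Lower-bound hours to copy one shard between nodes at a fixed throttle.

    throttle_mb_per_s is an ASSUMED value (40 MB/s), not read from a cluster.
    """
    shard_mb = shard_gb * 1024  # GB -> MB, binary units
    return shard_mb / throttle_mb_per_s / 3600  # seconds -> hours


# Recommended ceiling, the current shard size, and the size after doubling.
for size_gb in (50, 250, 500):
    print(f"{size_gb} GB shard: ~{transfer_hours(size_gb):.1f} h")
# 50 GB shard: ~0.4 h
# 250 GB shard: ~1.8 h
# 500 GB shard: ~3.6 h
```

Under these assumptions a single 500 GB shard ties up a recovery slot for several hours, and the cluster stays short of a copy of that shard for the whole time.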

    Those are some of the downsides of oversized shards. In practice you may run into other issues as well, so you should plan the number of shards correctly beforehand.
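
As a minimal sketch of that planning step, the arithmetic below uses the figures from the question (5 primary shards of roughly 250 GB each, about to double) and the 50 GB per-shard guideline it cites. The helper name `primaries_needed` is hypothetical, and the 50 GB target is a rule of thumb rather than a hard limit.

```python
import math


def primaries_needed(total_primary_gb: float, target_shard_gb: float = 50.0) -> int:
    """Smallest primary shard count that keeps each shard at or under the target size."""
    return max(1, math.ceil(total_primary_gb / target_shard_gb))


current_total_gb = 5 * 250                   # 5 primaries x ~250 GB, from the question
projected_total_gb = current_total_gb * 2    # after the planned ingest roughly doubles it

print(primaries_needed(current_total_gb))    # 25
print(primaries_needed(projected_total_gb))  # 50
```

The primary shard count is fixed when an index is created, so reaching numbers like these would usually mean creating a new index with more primaries and reindexing into it, or moving to rollover so that no single index grows this large.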