Tags: mongodb, cluster-computing, sharding

How to handle different server types in MongoDB sharded cluster


Is there a way to deal with different server types in a sharded cluster? According to the MongoDB documentation, the balancer attempts to achieve an even distribution of chunks across all shards in the cluster. So it seems to be based purely on the amount of data.

However, when you add new servers to an existing sharded cluster, the new servers typically have more disk space, faster disks, and more CPU power. Especially when you run an application for several years, this situation is likely to arise.

Does the balancer take such factors into account, or do you have to ensure that all servers in a sharded cluster have similar performance and resources?


Solution

  • You are correct that the balancer assumes that all parts of the cluster consist of similar hardware. However, you can use zone sharding to tailor the behaviour of the balancer.

    To quote from the zone sharding docs page:

    In sharded clusters, you can create zones of sharded data based on the shard key. You can associate each zone with one or more shards in the cluster. A shard can associate with any number of zones. In a balanced cluster, MongoDB migrates chunks covered by a zone only to those shards associated with the zone.

    Using zones, you can distribute data by location, by hardware spec, by application or customer, and so on.

    To directly answer your question, the use case you'll be most interested in is Tiered Hardware for Varying SLA or SLO. Please see the link for a tutorial on how to achieve this; a minimal sketch of the setup also follows below.

    Note that defining the zones is a design decision on your part, and there is currently no automated way for the server to do this for you.

    Small note: the balancer balances the cluster purely using the shard key. It doesn't take the amount of data into account at all. Thus, with an improperly designed shard key, it is possible to have some shards overflowing with data while others are completely empty. In a pathological mis-design case, some chunks are not divisible, leading to a situation where the cluster is forever unbalanced until an extensive redesign is done (see the second sketch below).
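
    For illustration, here is a minimal sketch of the tiered-hardware pattern in the mongosh shell. The shard names (shardFast, shardOld), the app.records namespace, the creation_date shard key, and the cut-off date are all assumptions for the example, not values from your cluster:

        // Assumes app.records is already sharded on { creation_date: 1 }.
        // Associate each shard with a zone matching its hardware tier.
        sh.addShardToZone("shardFast", "recent")   // newer, faster server
        sh.addShardToZone("shardOld", "archive")   // older, slower server

        // Route chunks to zones by shard key range. The minimum bound is
        // inclusive and the maximum bound is exclusive, so the two ranges
        // below cover the whole key space without overlapping.
        sh.updateZoneKeyRange(
          "app.records",
          { creation_date: ISODate("2020-01-01") },
          { creation_date: MaxKey },
          "recent"
        )
        sh.updateZoneKeyRange(
          "app.records",
          { creation_date: MinKey },
          { creation_date: ISODate("2020-01-01") },
          "archive"
        )

    Note that the cut-off is static: documents never change their creation_date, so the tutorial's approach is to update the zone ranges periodically to roll the window forward, at which point the balancer migrates aging chunks onto the archive hardware.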
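
    And here is a sketch of the mis-design trap from the small note, again with hypothetical names (an app.orders collection whose status field takes only a handful of values). A low-cardinality shard key produces chunks that can never be split:

        sh.enableSharding("app")

        // BAD: status has very few distinct values, so all documents sharing
        // a value fall into one chunk range that MongoDB cannot split further.
        // The balancer has nothing to move, and the cluster stays unbalanced.
        sh.shardCollection("app.orders", { status: 1 })

        // Better alternative (a collection's shard key is chosen once, so
        // these two calls are shown as either/or): the compound key restores
        // cardinality within each status value, keeping chunks divisible.
        sh.shardCollection("app.orders", { status: 1, _id: 1 })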