We have a large MongoDB collection that we'd like to start sharding. The collection has 3.4B records and is ~14.6TB in size (5.3TB compressed on disk). This collection typically sees writes on the order of ~5M per hour, but we expect this to continue to grow year over year. The indexes on this collection are ~220GB in size.
All records have a feedId, and all queries will be for records belonging to a specific feedId. There are currently ~200 unique feedId values, but the distribution across each value is highly non-linear. On the low end, some feedIds may only see dozens of records per day. On the other hand, the top 5 feedIds make up ~75% of the dataset.
Records also have a timestamp, and queries will always be for a given date range. The timestamp field is more-or-less monotonic.
There is already an existing compound index on feedId and timestamp.
The typical working set for this collection is only the last few weeks' worth of data, and is therefore only a very small percentage of the actual data. Queries for this data must be very fast, with slower queries for the historical data being acceptable. As such, we're planning to use "tags" and/or "zones" to move older data to nodes with larger HDDs and use nodes with SSDs for the "hot" data.
Based on these factors, is using a shard key of {feedId: 1, timestamp: 1} reasonable? My feeling is that it may lead to "hot" nodes due to the non-linearity of feedId and the monotonic nature of timestamp. Would adding a "hashed" field to the key make it better/worse?
So let's take this bit by bit!
The collection has 3.4B records and is ~14.6TB in size (5.3TB compressed on disk)
The nature of sharding is such that it's important to get this right the first time through. I'm going to go into more detail below, but the TL;DR is: test your candidate shard key before touching production by copying a representative subset of your data (e.g. using mongodump --query) to a staging cluster (e.g. using mongorestore) and sharding it there.
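As a rough sketch of that staging test, where the host names, namespace, query filter, and paths are all placeholders:

```sh
# Dump a representative subset from production (hosts, names, and the
# query filter are placeholders)
mongodump --host prod-mongos.example.net --db mydb --collection events \
  --query '{ "feedId": "feed-42" }' --out /backups/shardkey-test

# Restore it into a staging cluster and shard it there with the candidate key
mongorestore --host staging-mongos.example.net /backups/shardkey-test
```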
Now, let's dig in:
There are currently ~200 unique feedId values, but the distribution across each value is highly non-linear. On the low end, some feedIds may only see dozens of records per day. On the other hand, the top 5 feedIds make up ~75% of the dataset.
So one field that supports a good chunk of your queries has low cardinality (~200 distinct values) and a heavily skewed distribution. You are definitely likely to see hotspotting if you were to shard on this field alone. [1]
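To make that concrete, here's a sketch of what sharding on feedId alone would mean (namespace is illustrative):

```js
// Sharding on feedId alone (namespace is illustrative): each feedId value is a
// single point in the key space, so a chunk that holds only one of the top-5
// feedIds can never be split further and ends up pinned to a single shard.
sh.shardCollection("mydb.events", { feedId: 1 })
```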
Records also have a timestamp and queries will always be for a given date range. The timestamp field is more-or-less monotonic.
So another field that supports the majority of your queries, but on its own it's also not great for sharding: a monotonically increasing key funnels every new insert into the chunk holding the highest key values. [2]
Records also have a timestamp and queries will always be for a given date range. The timestamp field is more-or-less monotonic.
This to me kind of implies that the primary field you're querying against is time based: for a given period of time, give me the documents with the specified feedId. You're also going to get targeted queries, because you're querying on the shard key more often than not (e.g. either on a range of time, or a range of time + the feedId). [3]
This also supports your idea for zoning:
As such, we're planning to use "tags" and/or "zones" to move older data to nodes with larger HDDs and use nodes with SSDs for the "hot" data.
With zoning, you can use any key in the shard key, as long as you include the entire prefix leading up to that key. So { feedId: 1, timestamp: 1 } would principally support zones on feedId, or on feedId plus timestamp. It would not let you express a single range like "all feeds newer than date X", which is really what your hot/cold split needs, so it isn't quite what you are looking for. [4]
Based on that alone, I would venture that { timestamp : 1, feedId : 1 } would be a good selection. What your testing would need to look into is whether adding a low-cardinality field to a monotonically increasing field provides good chunk distribution.
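As a sketch of how the zoning could then be expressed with that key, assuming the 3.4+ zone helpers; the namespace, shard names, and cutoff date are all placeholders:

```js
// Candidate key with timestamp leading (namespace is a placeholder)
sh.shardCollection("mydb.events", { timestamp: 1, feedId: 1 })

// SSD-backed shards serve the "hot" zone, HDD-backed shards the "archive" zone
sh.addShardToZone("ssdShard01", "hot")
sh.addShardToZone("hddShard01", "archive")

// Zone ranges must follow the shard key order, so timestamp comes first.
// Lower bound is inclusive, upper bound is exclusive.
sh.updateZoneKeyRange("mydb.events",
  { timestamp: ISODate("2018-06-01"), feedId: MinKey },
  { timestamp: MaxKey, feedId: MaxKey },
  "hot")
sh.updateZoneKeyRange("mydb.events",
  { timestamp: MinKey, feedId: MinKey },
  { timestamp: ISODate("2018-06-01"), feedId: MinKey },
  "archive")
```

Keep in mind that zone ranges are static, so a setup like this needs the cutoff date moved forward periodically (and the affected chunks rebalanced) as "hot" data ages into "cold".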
Now, as far as hashing:
Would adding a "hashed" field to the key make it better/worse?
If you mean your documents already have some hashed field, then you could definitely add that just for randomness. But if you're talking about a hashed shard key, then that's a different story. [5]
Zones and hashed shard keys don't play together. The nature of the hashed shard key means that the chunk ranges (and therefore zones) represent the hashed shard key values. So even if you have two documents with values that are very near to each other, they are likely to end up on completely different chunks. So creating a zone on a range of hashed shard key values probably won't do what you want it to do. You could do something like using zones with hashed sharding to move the entire collection onto a subset of shards in the cluster, but that's not what you want to do. [6]
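For illustration only, this is the kind of thing a hashed shard key means here, and why a zone over it can't express a time-based split (namespace is illustrative):

```js
// A hashed shard key partitions on the hash of the value, not the value itself.
// Two documents with adjacent timestamps will usually hash into completely
// unrelated chunk ranges, so a zone defined over this key is a range of hash
// values and cannot mean "everything newer than X".
sh.shardCollection("mydb.events", { timestamp: "hashed" })
```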
Now there is one critical issue you might run into - you have a huge collection. Your choice of shard key might cause issues for the initial split where MongoDB attempts to divide your data into chunks. Please take a look at the following section in our documentation: Sharding an Existing Collection. There is a formula there for you to use to estimate the max collection size your shard key can support with the configured chunk size (64MB by default). I'm going to guess that you'll need to increase your chunk size to 128MB or possibly 256MB initially. This is only required for the initial sharding procedure. Afterwards you can reduce chunk size back to defaults and let MongoDB handle the rest.
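If I recall that docs page correctly, the estimate is roughly maxSplits = 16777216 / (average shard key value size in bytes) and maxCollectionSize (MB) = maxSplits * (chunkSize / 2). As a back-of-the-envelope sketch, where the 40-byte average key size is purely a guess you'd replace with a measured value:

```js
// Back-of-the-envelope version of the initial-split estimate.
// The 40-byte average shard key value size is an assumption -- measure the
// real BSON size of your { timestamp, feedId } values.
var avgShardKeyValueBytes = 40
var chunkSizeMB = 64                                       // default chunk size
var maxSplits = 16777216 / avgShardKeyValueBytes           // ~419,430
var maxCollectionSizeMB = maxSplits * (chunkSizeMB / 2)    // ~13.4M MB, roughly 12.8 TB

// Raising the chunk size (run against a mongos, per the "Modify Chunk Size
// in a Sharded Cluster" docs page) raises that ceiling proportionally:
use config
db.settings.save({ _id: "chunksize", value: 128 })         // value is in MB
```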
Mind you, this is going to have a performance impact. You'll have chunks migrating across shards, plus the overhead for the actual chunk split. I would recommend you post to our Google Group for more specific guidance here.