Tags: nosql, azure-cosmosdb, partitioning, database-partitioning, azure-cosmosdb-sqlapi

Read Model Partition Key Strategy


I have a collection of documents that looks like the following:

[image: sample document]

There is one document per VIN/SiteID, and our main access pattern is to show all documents at a specific site. I see two potential partition keys we could choose from:

  1. SiteID: We only have 75 sites, so the cardinality is not very high. Also, the documents are not very big, so the 10GB limit is probably OK.
  2. SiteID/VIN: The data is now more evenly distributed, but each logical partition will only store one item. Is this an anti-pattern? Also, to support our access pattern we will need to use a cross-partition query. Again, the data set is small, so is this a problem?

Based on what I am describing, which partition key makes more sense?

Any other suggestions would be greatly appreciated!


Solution

  • Your first option makes a lot of sense and could be a good partition key, but the words "probably OK" do not inspire confidence. Remember, the only way to change the partition key is to migrate to a new collection. If you can take that risk, then SiteID (which I'm guessing you will always have) is a good partition key.

    If you have both VIN and SiteID when you are reading or querying, then the SiteID/VIN combination is the safer choice. There is no problem per se with each logical partition storing a single item. It only becomes a problem when you run cross-partition queries. If you know both VIN and SiteID in your queries, then it's a great plan.

    You also have to remember that the RUs provisioned on a collection are split evenly between its physical partitions, so each partition only gets a share of the total throughput.
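
The trade-off above can be sketched with a small in-memory model. This is an illustration only and does not call Cosmos DB: the sample data (75 sites with 40 vehicles each), the helper names, and the RU figures are all assumptions made up for the example. It groups documents by each candidate partition key and counts how many logical partitions the "all documents at a site" query would have to touch, then shows the even RU split. In the real service a cross-partition query is fanned out to physical partitions, so the logical-partition count here is only a rough proxy for the extra work.

```python
from collections import defaultdict

# Hypothetical sample data: 75 sites, 40 vehicles per site (illustrative numbers).
docs = [
    {"SiteID": f"site-{s}", "VIN": f"VIN{s:03d}{v:04d}"}
    for s in range(75)
    for v in range(40)
]


def logical_partitions(documents, key_fn):
    """Group documents by partition key value, mimicking logical partitions."""
    groups = defaultdict(list)
    for doc in documents:
        groups[key_fn(doc)].append(doc)
    return groups


def partitions_touched(groups, site_id):
    """Count logical partitions that hold at least one document for the site."""
    return sum(
        1
        for items in groups.values()
        if any(doc["SiteID"] == site_id for doc in items)
    )


def ru_per_physical_partition(total_ru, physical_partitions):
    """Provisioned throughput is divided evenly across physical partitions."""
    return total_ru / physical_partitions


# Option 1: partition by SiteID only.
by_site = logical_partitions(docs, lambda d: d["SiteID"])
# Option 2: partition by the SiteID/VIN combination.
by_site_vin = logical_partitions(docs, lambda d: (d["SiteID"], d["VIN"]))

print(len(by_site), partitions_touched(by_site, "site-7"))
print(len(by_site_vin), partitions_touched(by_site_vin, "site-7"))
# Assumed figures: 10,000 RU/s spread over 5 physical partitions.
print(ru_per_physical_partition(10_000, 5))
```

With SiteID as the key, the site query stays inside a single logical partition. With SiteID/VIN, every document is its own logical partition, which is fine for point reads that supply both values but means the site query cannot be routed to a single partition.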