I'm learning about AWS DynamoDB and something confuses me. If one write to a DynamoDB table implies one write to each of its GSIs (because every GSI keeps its own copy of the data under a different partitioning), then why would you ever pick a different write capacity for the "main" table and the GSIs? Why do we even have the option to?
Here are a couple of reasons why a GSI could need less write capacity than the table:
(1) You apply the sparse index pattern, in which not all table items are written to an index:
For any item in a table, DynamoDB writes a corresponding index entry only if the index sort key value is present in the item. If the sort key doesn't appear in every table item, or if the index partition key is not present in the item, the index is said to be sparse.
(2) Your index query patterns don't require the whole item. If so, you can write ("project") only a subset of an item's attributes to the index. Index writes are charged by the size of the projected attributes rather than the size of the full item, so choosing projections carefully can reduce demand for index writes:
Because secondary indexes consume storage and provisioned throughput, you should keep the size of the index as small as possible ... If your queries usually return only a small subset of attributes, and the total size of those attributes is much smaller than the whole item, project only the attributes that you regularly request.
Both approaches feature in the DynamoDB best practices guide.
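To make the two effects concrete, here is a rough Python model of the write cost of inserting one new item. It is only a sketch under stated assumptions: a standard write costs 1 WCU per 1 KB rounded up, the attribute sizes are made-up numbers rather than DynamoDB's exact item-size calculation, and `wcu`, `put_cost`, and the attribute names are hypothetical helpers for illustration, not part of any AWS SDK.

```python
import math

def wcu(size_bytes):
    """Write capacity units for one standard write: 1 WCU per 1 KB, rounded up."""
    return max(1, math.ceil(size_bytes / 1024))

def put_cost(item_sizes, index_keys, projected):
    """Return (table_wcu, index_wcu) for inserting one new item.

    item_sizes: attribute name -> size in bytes (hypothetical sizes)
    index_keys: attribute names that form the index key
    projected:  attribute names copied into the index (keys included)
    """
    table = wcu(sum(item_sizes.values()))
    # Sparse index: if an index key attribute is missing, no index entry
    # is written, so the index consumes no write capacity for this item.
    if not all(key in item_sizes for key in index_keys):
        return table, 0
    # Projection: the index write is sized by the projected attributes only.
    index = wcu(sum(size for name, size in item_sizes.items() if name in projected))
    return table, index

# Hypothetical orders table; only open orders carry "open_since".
closed_order = {"order_id": 36, "payload": 3000}
open_order = {"order_id": 36, "payload": 3000, "open_since": 24}

index_keys = ("open_since",)
projected = {"order_id", "open_since"}

print(put_cost(closed_order, index_keys, projected))  # (3, 0)
print(put_cost(open_order, index_keys, projected))    # (3, 1)

# Ten inserts, of which two are open orders.
batch = [closed_order] * 8 + [open_order] * 2
costs = [put_cost(item, index_keys, projected) for item in batch]
table_total = sum(t for t, _ in costs)
index_total = sum(i for _, i in costs)
print(table_total, index_total)  # 30 2
```

In this made-up workload the table needs 30 WCUs for the batch while the index needs 2, which is the kind of gap that makes separate capacity settings worthwhile.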
What about the opposite case? @hunterhacker noted a scenario where a GSI might conceivably need *more* capacity than the table: if you have lots of update operations that modify a GSI primary key attribute but NOT the table key attributes. That would need a delete + insert (2 operations) on the index, but only an update (1 operation) on the table.
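The delete + insert case can be sketched the same way. This is a simplified model of how many index write operations a single table update triggers, based on the rules described in the DynamoDB documentation for GSI throughput. The function `index_writes_for_update` and the attribute names are hypothetical, and the model counts operations only (each costs at least 1 WCU for small items).

```python
def index_writes_for_update(old_item, new_item, index_keys, projected):
    """Count index write operations caused by one table update (simplified)."""
    was_indexed = all(key in old_item for key in index_keys)
    is_indexed = all(key in new_item for key in index_keys)
    if not was_indexed and not is_indexed:
        return 0  # item is absent from the index before and after
    if was_indexed != is_indexed:
        return 1  # a single put into, or delete from, the index
    if any(old_item[key] != new_item[key] for key in index_keys):
        return 2  # index key changed: delete old entry + put new entry
    if any(old_item.get(attr) != new_item.get(attr) for attr in projected):
        return 1  # only projected non-key attributes changed
    return 0      # nothing the index stores has changed

# Hypothetical GSI keyed on "order_status"; the table key is "order_id".
index_keys = ("order_status",)
projected = {"order_id", "order_status"}

before = {"order_id": "o1", "order_status": "PENDING", "note": "gift"}
after = {"order_id": "o1", "order_status": "SHIPPED", "note": "gift"}

table_writes = 1  # the update itself is one write on the table
index_writes = index_writes_for_update(before, after, index_keys, projected)
print(table_writes, index_writes)  # 1 2
```

A workload dominated by updates like this one would consume roughly twice as many write operations on the index as on the table, so the GSI would need the higher setting.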