I am looking for a way to improve the following query, which reads too many records. I need to query on 3 keys.
DynamoDB only allows 2 conditions in key-condition-expression, so I have to use filter-expression for the third, and the filter is applied only after all matching items have been read.
I am also considering combining 2 keys into the sort key of a GSI as an alternative. Is this the right way to do this?
aws dynamodb query \
--table-name bundles \
--index-name GSI-RegulationSidBundleStatus \
--key-condition-expression "RegulationSid = :regulationSid AND BundleStatus = :bundleStatus" \
--filter-expression "AccountSid = :accountSid" \
--expression-attribute-values '{
":accountSid": {
"S": "XXXXXXXXXXXXXXXXXXXXXXXXXXX"
},
":regulationSid": {
"S": "YYYYYYYYYYYYYYYYYYYYYYYYYYY"
},
":bundleStatus": {
"S": "APPROVED"
}
}'
Yes, combining multiple keys into the partition key or sort key is a common pattern. To make each part identifiable, it is common to prefix each key with its key type, or an abbreviation of it, followed by a hash, and to put a hash between the keys.
For your case, a sort key would look similar to:
r#${RegulationSid}#b#${BundleStatus}
This is also common for partition keys, even when you have just one key. In your case: a#${AccountSid}
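As a rough illustration of the key layout described above, the values could be composed with two small helpers. This is only a sketch: the function names are hypothetical, and it assumes every abbreviation (a, r, b) is followed by a hash and that the SIDs themselves never contain a hash.

```shell
#!/usr/bin/env bash
# Hypothetical helpers that compose prefixed, hash-delimited key values.
# Abbreviations used here: a = account, r = regulation, b = bundle status.

# Partition key layout: a#<AccountSid>
make_partition_key() {
  printf 'a#%s' "$1"
}

# Sort key layout: r#<RegulationSid>#b#<BundleStatus>
make_sort_key() {
  printf 'r#%s#b#%s' "$1" "$2"
}

# Placeholder SIDs, as in the question.
pk=$(make_partition_key "XXXXXXXXXXXXXXXXXXXXXXXXXXX")
sk=$(make_sort_key "YYYYYYYYYYYYYYYYYYYYYYYYYYY" "APPROVED")
echo "$pk"
echo "$sk"
```

These composed strings would be written to the GSI key attributes whenever an item is stored or its status changes, so an exact lookup on all three values becomes a plain key condition with no filter expression.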
When deciding whether to overload the partition key or the sort key, and in which order to put the keys, look at your access patterns. If you know you have a pattern to get all items for a regulation, put the regulation key first. Or, if you know you need to get all regulations for a given bundle status, put the bundle status first. Then you can use a begins_with condition on the sort key to get these lists of items, reusing the same GSI.
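A prefix query of that kind might look like the sketch below. The index name GSI-AccountRegulationStatus and the attribute names GsiPk and GsiSk are assumptions made up for this example, not names from your table, and the JSON is built with printf on the assumption that the SIDs contain no quotes or other characters that need escaping.

```shell
#!/usr/bin/env bash
# Sketch only: GSI-AccountRegulationStatus, GsiPk and GsiSk are hypothetical names.

# Build the expression attribute values as JSON.
# $1 = AccountSid, $2 = sort key prefix, e.g. "r#<RegulationSid>#"
build_values() {
  printf '{":pk":{"S":"a#%s"},":skPrefix":{"S":"%s"}}' "$1" "$2"
}

# Fetch every bundle of one account and one regulation, whatever its status.
# $1 = AccountSid, $2 = RegulationSid
query_bundles_for_regulation() {
  local account_sid="$1" regulation_sid="$2"
  aws dynamodb query \
    --table-name bundles \
    --index-name GSI-AccountRegulationStatus \
    --key-condition-expression "GsiPk = :pk AND begins_with(GsiSk, :skPrefix)" \
    --expression-attribute-values "$(build_values "$account_sid" "r#${regulation_sid}#")"
}
```

The prefix ends with a hash so that one regulation SID cannot accidentally match another SID that merely starts with the same characters. Passing the full sort key with an equality condition instead of begins_with would cover the original three-key lookup.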
In your code documentation, you can list the used abbreviations to make sure you don't use the same abbreviation for multiple key types.
I suggest always keeping the customer account ID as the first key in the partition key. If you later decide to use leading-key row-level authorisation (the dynamodb:LeadingKeys IAM condition key), this will come in handy.