I have a use case where my model has couple of attributes. Each model can be uniquely identified by an OrderId and a Version. The no.of versions for 1 OrderId is not much, on an average 2-3 versions per OrderId. I have 2 access patterns only:
My understanding is that I have 2 options to model this in DynamoDB
Note: Let's assume that I will be able to get the same cardinality in both options.
Based on some reading, this is what I know now.
In approach 1, when using a GetItem API, DynamoDB uses an internal hash using PartitionKey to figure out the partition where the document resides. However, At this point, I'm not really sure how it then gets the actual document using the SortKey. Does it do a binary search or some other algorithm or some indexing?
In approach 2, when using a GetItem API, DynamoDB uses an internal hash using the concatenated string and finds the partition. Now, how does dynamo find the actual document in that partition?
I believe you're asking many questions which really have no relevance to what it is you really want to know, which is, should I use a sort-key or not.
Delete all orders of an OrderId which has Version lesser than an input version
Based on this necessity, you should use a sort key. Let's imagine you are unsure how many versions you have, but you know you want to delete all version older than N. You would not be able to find those without knowing them in advance if you used a composite partition key. But if you use version as the sort key, you can say give me the first 5 versions for this orderId
etc.... without the need to know the versions.
Using a sort-key provides you some query flexibility which can be very useful, so long as the partition key defined is of high cardinality and won't be written to more than 1000 times per second (assuming item size <1KB).
To answer your question about time complexity, DynamoDB stores your data in partitions, items are stored based on their sort-key if defined. The search algorithm works on a B-tree, DynamoDB first locates the necessary partition and uses a B-tree to locate the location of the items within that partition.