Search code examples
amazon-dynamodbdynamodb-queriesamazon-dynamodb-index

How to query 100 first items ordered by sort key of a DynamoDB table?


I have a simple DynamoDB table called Scores with 3 attributes:

| id                                   | score | username |
|--------------------------------------|-------|----------|
| ed6dbe2f-ffb3-44f3-a5f1-bb84502e5400 | 300   | bob      |
| b8f91dd9-09e6-466b-88f3-9e7b95dc9a3a | 100   | peter    |
| 55dd0bd1-2bc7-4bb7-a988-20e506c2c35c | 500   | bob      |

my key attributes are hash key = id and sort key = score

I want to query the first 100 smallest scores, with the username of the player who scored it, ordered asc.

in SQL, in would be:

SELECT * FROM Scores ORDER BY score LIMIT 100;

I tried and failed with these tactics:

  • use a scan => it returns the items unordered.
  • use a query => I don't want to target a specific primary key value
  • use a global secondary index with hash key = username and sort key = score => I can't query only with the sort key, I have to specify the hash key as well

so my poor solution was to perform a scan, then sort and slice in memory... sigh

I'm convinced that a similar problem has already been discussed and solved somewhere, but I already read a lot and I can't find anything. Can you give me a hint please ?


Solution

  • Create a GSI with a singular PK value and the timestamp as the SK. Then you can do a Query for the first 1,000 passing a limit of 100.

    This works, but only up to 1,000 write units, because that singular PK value gets hot. The solution there is to shard. Decide how many write units you need. 10,000? OK so instead of one PK value, use 10. Each item gets randomly assigned a shard number as its GSI PK attribute.

    Your query call will have to run 10 queries (client can do in parallel) and coalesce the results client side. Works for any scale of write traffic.

    It’s a technique discussed in https://youtu.be/0iGR8GnIItQ (at 33:44).