amazon-web-services amazon-dynamodb aws-sdk dynamodb-queries aws-sdk-nodejs

DynamoDB: How to retrieve the first item (by sort key) for each of a given list of partition keys


I have a DynamoDB table that stores historical run data for processes that run on my server. I need a place where I can aggregate these processes and see the data for the latest run of each. Each process has its own ProcessId, which is the partition key for the table. The sort key is the StartDateTime.

{
  ProcessId, // Partition Key
  StartDateTime, // Sort Key
  ... // More data
}

Essentially, I need to retrieve the item with the most recent StartDateTime for each ProcessId that I give. I'm using a Node.js Lambda with the aws-sdk to retrieve the data. I've looked into using BatchGetItem, but my understanding is that for tables with a partition key and a sort key, you need to provide both to retrieve an item. I've also looked into using a Query, but I would need to run a separate query for each partition, which is less than ideal. Does anyone know of a way I can make this request in one call rather than making a separate call per partition?
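
For reference, the per-partition Query approach described above could look roughly like the sketch below. It assumes `docClient` is an `AWS.DynamoDB.DocumentClient` from aws-sdk v2, that the table name passed in is a placeholder, and that StartDateTime is stored in a format that sorts chronologically (for example an ISO 8601 string). The function name `getLatestRuns` is made up for illustration.

```javascript
// Sketch: one Query per ProcessId, issued in parallel.
// `docClient` is assumed to be an AWS.DynamoDB.DocumentClient (aws-sdk v2).
async function getLatestRuns(docClient, tableName, processIds) {
  const results = await Promise.all(
    processIds.map((processId) =>
      docClient
        .query({
          TableName: tableName,
          KeyConditionExpression: 'ProcessId = :pid',
          ExpressionAttributeValues: { ':pid': processId },
          ScanIndexForward: false, // descending sort-key order, newest first
          Limit: 1, // read only the most recent item
        })
        .promise()
    )
  );
  // Partitions with no items contribute nothing to the result.
  return results.flatMap((result) => result.Items || []);
}
```

This is still one request per partition; the requests simply run concurrently instead of one after another.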


Solution

  • You appear to be attempting a kind of aggregation. DynamoDB is typically not well suited to aggregations; it is better at CRUD-style operations.

    Instead of running expensive queries or scans, try enabling DynamoDB Streams on the table and using another Lambda to 'upsert' the latest start time into a second DynamoDB table that has the ProcessId as its partition key.

    Then you can look up the latest start time for each ProcessId in this new table. Because that table is keyed on ProcessId alone, BatchGetItem can fetch several ProcessIds in a single call.
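
A minimal sketch of the suggestion above, under several assumptions: `docClient` is an `AWS.DynamoDB.DocumentClient` from aws-sdk v2, the second table (given the hypothetical name `LatestRuns` in the usage note) has ProcessId as its only key, both key attributes are strings, and StartDateTime sorts chronologically as a string (for example ISO 8601). The helper names `makeStreamHandler` and `getLatestStartTimes` are made up for illustration.

```javascript
// Sketch only. `docClient` is assumed to be an AWS.DynamoDB.DocumentClient
// (aws-sdk v2); `latestTable` is a hypothetical name for the second table.

// Builds a Lambda handler for the source table's stream. The stream's view
// type must include new images (NEW_IMAGE or NEW_AND_OLD_IMAGES).
function makeStreamHandler(docClient, latestTable) {
  return async function handler(event) {
    for (const record of event.Records) {
      if (record.eventName !== 'INSERT' && record.eventName !== 'MODIFY') continue;
      // Stream images use DynamoDB JSON; both keys are assumed to be strings.
      const image = record.dynamodb.NewImage;
      const processId = image.ProcessId.S;
      const startDateTime = image.StartDateTime.S;
      try {
        await docClient
          .update({
            TableName: latestTable,
            Key: { ProcessId: processId },
            UpdateExpression: 'SET StartDateTime = :start',
            // Only move forward in time, so a late or retried record
            // cannot overwrite a newer run.
            ConditionExpression:
              'attribute_not_exists(StartDateTime) OR StartDateTime < :start',
            ExpressionAttributeValues: { ':start': startDateTime },
          })
          .promise();
      } catch (err) {
        // A failed condition just means a newer run is already stored.
        if (err.code !== 'ConditionalCheckFailedException') throw err;
      }
    }
  };
}

// Reads the latest run for several ProcessIds in one BatchGetItem call.
// BatchGetItem accepts at most 100 keys per request, and any
// UnprocessedKeys are not retried in this sketch.
async function getLatestStartTimes(docClient, latestTable, processIds) {
  const data = await docClient
    .batchGet({
      RequestItems: {
        [latestTable]: { Keys: processIds.map((id) => ({ ProcessId: id })) },
      },
    })
    .promise();
  return data.Responses[latestTable] || [];
}
```

In a Lambda this might be wired up as `exports.handler = makeStreamHandler(new AWS.DynamoDB.DocumentClient(), 'LatestRuns')`. The sketch copies only StartDateTime; any other attributes needed for the aggregate view would have to be added to the update expression.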