Search code examples
amazon-dynamodbamazon-dynamodb-index

dynamodb index does not return all data


I have a table storedgames, which contains 2092 items.

table

And it also has an index on that table, which also lists 2092 items.

user-index

when I fetch data, I use the index, to obtain the items for one specific user.

const params = {
  TableName: "storedgames",
  IndexName: "user-index",
  KeyConditionExpression: "#usr = :usr",
  ExpressionAttributeNames: { "#usr": "user" },
  ExpressionAttributeValues: { ":usr": user }
};

const data = await new Promise((resolve, reject) => {
  docClient.query(params, (err, data) => {
    if (err) { reject(err); } else { resolve(data); }
  });
}).catch((err) => {
  console.error(err);
  return false;
});

However, the above code does not return all items. It only finds 42. And for today's items there is only 1 hit. When I check directly on the AWS webpage, I actually find more items for today.

more items on webpage in table

And even when I do this using the index, it finds more records.

more items on webpage using index

When I leave out the filtering of the day, I actually find over 130 items, while my javascript code only returns 42 items when I leave out the day filter.

130 vs 42 items


So my question is, why does the data of my index seem to be incomplete when I call it programmatically ?


Solution

  • The records actually contain a lot of data, and there appears to be a limit in the amount of data that can be fetched per query.

    A single Query operation can retrieve a maximum of 1 MB of data. This limit applies before any FilterExpression is applied to the results. If LastEvaluatedKey is present in the response and is non-null, you must paginate the result set (see Paginating the Results).

    So, I one possible solution, is to perform multiple fetches until you have the entire collection.

    const queryAllItems = (params, callback) => {
      let fullResult = { Items: [], Count: 0, ScannedCount: 0 };
    
      const queryExecute = (callback) => {
        docClient.query(params, (err, result) => {
          if (err) {
            callback(err);
            return;
          }
    
          const { Items, LastEvaluatedKey, Count, ScannedCount } = result;
    
          fullResult.Items = [...fullResult.Items, ...Items];
          fullResult.Count += Count;
          fullResult.ScannedCount += ScannedCount;
    
          if (!LastEvaluatedKey) {
            callback(null, fullResult);
            return;
          }
    
          params.ExclusiveStartKey = LastEvaluatedKey;
          queryExecute(callback);
        });
      }
    
      queryExecute(callback);
    }
    

    Unfortunately, this isn't a complete solution. In my situation, a query for a mere 130 items (which require 4 actual fetches) takes about 15 seconds.