I have a DynamoDB table and I want to output items from it to a client using pagination. I thought I'd use DynamoDB.Paginator.Scan and supply StartingToken, but I don't see NextToken in the output of either the page or the iterator itself. So how do I get it?
My goal is a REST API where the client requests the next X items from a table, supplying StartingToken to iterate. Initially there's no token, but with each response the server returns NextToken, which the client supplies as StartingToken to get the next X items.
import boto3
import json
table="TableName"
client = boto3.client("dynamodb")
paginator = client.get_paginator("scan")
token = None
size=1
for i in range(1,10):
    client.put_item(TableName=table, Item={"PK":{"S":str(i)},"SK":{"S":str(i)}})
it = paginator.paginate(
    TableName=table,
    ProjectionExpression="PK,SK",
    PaginationConfig={"MaxItems": 100, "PageSize": size, "StartingToken": token}
)
for page in it:
    print(json.dumps(page, indent=2))
    break
As a side note: how do I get one page from the paginator without using break/for? I tried using next(it), but it does not work.
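On that side note: next(it) most likely fails because the object returned by paginate() is an iterable (it defines __iter__) rather than an iterator (it has no __next__), so it needs to be wrapped in iter() first. Below is a minimal sketch of that idea. FakePageIterator and first_page are hypothetical names, and the stand-in class is used instead of a real paginator so the snippet runs without AWS access.

```python
class FakePageIterator:
    """Hypothetical stand-in: an iterable that is not itself an iterator."""

    def __init__(self, pages):
        self._pages = pages

    def __iter__(self):
        # Generator-based __iter__; the class defines no __next__
        for page in self._pages:
            yield page


def first_page(page_iterable, default=None):
    """Return the first page, or `default` if there are no pages at all."""
    return next(iter(page_iterable), default)


it = FakePageIterator([{"Items": [{"PK": {"S": "2"}}]}, {"Items": []}])

try:
    next(it)  # same call as in the question
    next_worked = True
except TypeError:
    # Raised because `it` is iterable but not an iterator
    next_worked = False

page = first_page(it)
```

With a real paginator the equivalent call would be page = next(iter(paginator.paginate(...))).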
Here's the it object:
{
    '_input_token': ['ExclusiveStartKey'],
    '_limit_key': 'Limit',
    '_max_items': 100,
    '_method': <bound method ClientCreator._create_api_method.<locals>._api_call of <botocore.client.DynamoDB object at 0x000001CBA5806AA0>>,
    '_more_results': None,
    '_non_aggregate_key_exprs': [{'type': 'field', 'children': [], 'value': 'ConsumedCapacity'}],
    '_non_aggregate_part': {'ConsumedCapacity': None},
    '_op_kwargs': {'Limit': 1,
                   'ProjectionExpression': 'PK,SK',
                   'TableName': 'TableName'},
    '_output_token': [{'type': 'field', 'children': [], 'value': 'LastEvaluatedKey'}],
    '_page_size': 1,
    '_result_keys': [{'type': 'field', 'children': [], 'value': 'Items'},
                     {'type': 'field', 'children': [], 'value': 'Count'},
                     {'type': 'field', 'children': [], 'value': 'ScannedCount'}],
    '_resume_token': None,
    '_starting_token': None,
    '_token_decoder': <botocore.paginate.TokenDecoder object at 0x000001CBA5D81960>,
    '_token_encoder': <botocore.paginate.TokenEncoder object at 0x000001CBA5D82290>
}
And the page:
{
  "Items": [
    {
      "PK": {
        "S": "2"
      },
      "SK": {
        "S": "2"
      }
    }
  ],
  "Count": 1,
  "ScannedCount": 1,
  "LastEvaluatedKey": {
    "PK": {
      "S": "2"
    },
    "SK": {
      "S": "2"
    }
  },
  "ResponseMetadata": {
    "RequestId": "DBE4ON8SI0GOTS2RRO2OG43QJVVV4KQNSO5AEMVJF66Q9ASUAAJG",
    "HTTPStatusCode": 200,
    "HTTPHeaders": {
      "server": "Server",
      "date": "Fri, 30 Dec 2022 11:37:52 GMT",
      "content-type": "application/x-amz-json-1.0",
      "content-length": "121",
      "connection": "keep-alive",
      "x-amzn-requestid": "DBE4ON8SI0GOTS2RRO2OG43QJVVV4KQNSO5AEMVJF66Q9ASUAAJG",
      "x-amz-crc32": "973385738"
    },
    "RetryAttempts": 0
  }
}
I thought I could pass LastEvaluatedKey as the StartingToken, but that throws an error. I also tried to build the token like this, but it did not work:
it._token_encoder.encode(page["LastEvaluatedKey"])
I also thought about using plain scan without the iterator, but I'm actually outputting a heavily filtered result set. I need to set Limit to a very large value to get any results, and I don't want too many results at once. Is there a way to scan up to 1000 items but stop as soon as 10 items are found?
I would suggest not using the paginator but rather calling the lower-level Query (or Scan, as in your case) directly. The reason is the confusion between NextToken and LastEvaluatedKey. These are not interchangeable:
LastEvaluatedKey is passed to ExclusiveStartKey
NextToken is passed to StartingToken
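For the REST API in the question, the LastEvaluatedKey from one response can be handed to the client as an opaque string and turned back into an ExclusiveStartKey on the next request. Here is a minimal sketch, assuming the key attributes are strings or numbers in the low-level client format (binary attributes would need extra handling). encode_token and decode_token are hypothetical helper names, and the token is not signed, so treat it as untrusted input.

```python
import base64
import json


def encode_token(last_evaluated_key):
    """Turn a LastEvaluatedKey dict into an opaque URL-safe string (or None)."""
    if not last_evaluated_key:
        return None
    raw = json.dumps(last_evaluated_key, sort_keys=True).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_token(token):
    """Turn a client-supplied token back into an ExclusiveStartKey dict (or None)."""
    if not token:
        return None
    return json.loads(base64.urlsafe_b64decode(token.encode("ascii")))


# Key shape taken from the page shown in the question
key = {"PK": {"S": "2"}, "SK": {"S": "2"}}
token = encode_token(key)
```

The server would return token alongside the items, and on the next request pass decode_token(token) as ExclusiveStartKey, omitting that argument when there is no token.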
It's preferable to use the Resource client, which I believe leaves no confusion about how to paginate:
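import boto3
from boto3.dynamodb.conditions import Key

region = 'us-east-1'  # placeholder, use your own region
dynamodb = boto3.resource('dynamodb', region_name=region)
table = dynamodb.Table('my-table')

# Query needs a key condition; the attribute 'PK' and value '1' are placeholders
key_condition = Key('PK').eq('1')
response = table.query(KeyConditionExpression=key_condition)
data = response['Items']

# LastEvaluatedKey indicates that there are more results
while 'LastEvaluatedKey' in response:
    response = table.query(KeyConditionExpression=key_condition,
                           ExclusiveStartKey=response['LastEvaluatedKey'])
    data.extend(response['Items'])  # Items is a list, so extend rather than update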
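As for "scan up to 1000 items but stop as soon as 10 are found": one way to approximate this is to call scan in small pages and stop once enough matches have accumulated. The sketch below assumes the low-level client, and assumes the key attributes (PK and SK here) are included in the returned items so the scan can resume right after the last item handed out. scan_until and FakeClient are hypothetical names, and the fake client only exists so the snippet runs without AWS access.

```python
def scan_until(client, table, wanted=10, max_scanned=1000, page_limit=100,
               start_key=None, key_names=("PK", "SK"), **scan_kwargs):
    """Scan page by page until `wanted` items match or `max_scanned` are read.

    Returns (items, next_key); next_key is None once the table is exhausted.
    """
    items, scanned, next_key = [], 0, start_key
    while len(items) < wanted and scanned < max_scanned:
        kwargs = dict(scan_kwargs, TableName=table,
                      Limit=min(page_limit, max_scanned - scanned))
        if next_key:
            kwargs["ExclusiveStartKey"] = next_key
        response = client.scan(**kwargs)
        scanned += response.get("ScannedCount", 0)
        matched = response.get("Items", [])
        room = wanted - len(items)
        if len(matched) > room:
            # Overshoot: keep what fits and resume after the last kept item
            items.extend(matched[:room])
            return items, {name: items[-1][name] for name in key_names}
        items.extend(matched)
        next_key = response.get("LastEvaluatedKey")
        if next_key is None:
            break  # reached the end of the table
    return items, next_key


class FakeClient:
    """Hypothetical stand-in for the DynamoDB client; 'filters' to even PKs."""

    def __init__(self, rows):
        self.rows = rows

    def scan(self, TableName, Limit, ExclusiveStartKey=None, **_ignored):
        start = 0
        if ExclusiveStartKey:
            start = 1 + next(i for i, row in enumerate(self.rows)
                             if row["PK"] == ExclusiveStartKey["PK"])
        chunk = self.rows[start:start + Limit]
        matched = [row for row in chunk if int(row["PK"]["S"]) % 2 == 0]
        response = {"Items": matched, "Count": len(matched),
                    "ScannedCount": len(chunk)}
        if start + Limit < len(self.rows):
            response["LastEvaluatedKey"] = dict(chunk[-1])
        return response


rows = [{"PK": {"S": str(i)}, "SK": {"S": str(i)}} for i in range(1, 51)]
fake = FakeClient(rows)
first_items, resume_key = scan_until(fake, "TableName", wanted=3, page_limit=4)
more_items, _ = scan_until(fake, "TableName", wanted=3, page_limit=4,
                           start_key=resume_key)
```

With a real client, the filter would be passed through scan_kwargs (for example FilterExpression and ProjectionExpression), and the returned next_key is what the server would hand back to the caller for the following request.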