Tags: python, amazon-dynamodb, boto, boto3

How to batch_get_item many items at once given a list of primary partition key values


I have a DynamoDB table with a primary partition key column, foo_id, and no primary sort key. I have a list of foo_id values and want to get the observations associated with this list of ids.

I figured the best way to do this (?) is to use batch_get_item(), but it's not working out for me.

    # python code
    import boto3
    client = boto3.client('dynamodb')

    # ppk_values = list of `foo_id` values (strings) (< 100 in this example)
    x = client.batch_get_item(
        RequestItems={
            'my_table_name':
                {'Keys': [{'foo_id': {'SS': [id for id in ppk_values]}}]}
        })

I'm using SS because I'm passing a list of strings (list of foo_id values), but I'm getting:

ClientError: An error occurred (ValidationException) when calling the
BatchGetItem operation: The provided key element does not match the
schema

So I assume that means it's thinking foo_id contains list values instead of string values, which is wrong.

--> Is that interpretation right? What's the best way to batch query for a bunch of primary partition key values?


Solution

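  • Give each key as its own entry in Keys, as shown below. A key value cannot be declared as 'SS'.

    Your interpretation is right: foo_id is a String (S) attribute, so each key must hold a single string, not a string set (SS), and that mismatch is what raises the ValidationException. Each item is requested by its own key; there is no equivalent of SQL's IN clause. The snippet below uses the plain-value form accepted by boto3.resource. With boto3.client, as in the question, each value needs a type tag, e.g. {'foo_id': {'S': key1}}.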

    'Keys': [
                {
                    'foo_id': key1
                },
                {
                    'foo_id': key2
                }
    ], 
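    For the low-level client used in the question, one way to build the Keys list from a list of ids is sketched below. build_keys is a hypothetical helper name, the ids are made up, and the sketch assumes foo_id is a String (S) attribute.

```python
def build_keys(key_name, values):
    """Return one typed key dict per value, in the low-level client format."""
    # Each key is its own dict; the 'S' tag marks a single String value.
    return [{key_name: {'S': str(v)}} for v in values]


ppk_values = ['id-1', 'id-2', 'id-3']  # made-up example ids
request_items = {
    'my_table_name': {'Keys': build_keys('foo_id', ppk_values)}
}
# With a configured client this would be sent as:
#   client.batch_get_item(RequestItems=request_items)
```

    The point is that the list lives at the Keys level (one dict per id), not inside a single attribute value.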
    

    Sample code (change the table name and key values to match your table):

    from __future__ import print_function # Python 2/3 compatibility
    import boto3
    import json
    import decimal
    from botocore.exceptions import ClientError
    
    # Helper class to convert a DynamoDB item to JSON.
    class DecimalEncoder(json.JSONEncoder):
        def default(self, o):
            if isinstance(o, decimal.Decimal):
                if o % 1 != 0:  # has a fractional part (also true for negatives)
                    return float(o)
                else:
                    return int(o)
            return super(DecimalEncoder, self).default(o)
    
    dynamodb = boto3.resource("dynamodb", region_name='us-west-2', endpoint_url="http://localhost:8000")
    
    email1 = "abc@gmail.com"
    email2 = "bcd@gmail.com"
    
    try:
        response = dynamodb.batch_get_item(
            RequestItems={
                'users': {
                    'Keys': [
                        {
                            'email': email1
                        },
                        {
                            'email': email2
                        },
                    ],            
                    'ConsistentRead': True            
                }
            },
            ReturnConsumedCapacity='TOTAL'
        )
    except ClientError as e:
        print(e.response['Error']['Message'])
    else:
        item = response['Responses']
        print("BatchGetItem succeeded:")
        print(json.dumps(item, indent=4, cls=DecimalEncoder))
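    BatchGetItem accepts at most 100 keys per call and may hand some of them back under UnprocessedKeys, so a longer id list needs chunking and a retry loop. Below is a minimal sketch for the low-level client, assuming a String partition key. batch_get_all and chunk_size are names chosen for illustration, and production code would likely add a backoff between retries.

```python
def batch_get_all(client, table_name, key_name, values, chunk_size=100):
    """Fetch items for many String partition key values via BatchGetItem."""
    # Duplicate keys in one request are rejected, so drop repeats (order kept).
    unique = list(dict.fromkeys(values))
    items = []
    for start in range(0, len(unique), chunk_size):
        chunk = unique[start:start + chunk_size]
        request = {table_name: {'Keys': [{key_name: {'S': v}} for v in chunk]}}
        # Keep calling until no unprocessed keys remain for this chunk.
        while request:
            response = client.batch_get_item(RequestItems=request)
            items.extend(response.get('Responses', {}).get(table_name, []))
            request = response.get('UnprocessedKeys') or {}
    return items


# Usage, assuming AWS credentials and a region are configured:
#   import boto3
#   client = boto3.client('dynamodb')
#   items = batch_get_all(client, 'my_table_name', 'foo_id', ppk_values)
```

    The client is passed in as an argument so the helper can be exercised with a stub object before pointing it at a real table. Items come back in the typed format, e.g. {'foo_id': {'S': '...'}}, and not necessarily in the order the keys were given.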