Tags: python-3.x, amazon-web-services, amazon-dynamodb, dynamodb-queries

How to get the total item count from a DynamoDB table based on the TTL value


I am trying to count how many items are present in a DynamoDB table based on the value of the ttl column. The source has only 55k events, but the DynamoDB table shows 122k events. Our default TTL is 4 days. For a given period, we want to see how many events are present in the table, ignoring items whose ttl value is less than the current time. I'm new to AWS, so kindly help me with detailed code. Thank you :)

Primary key : customer_id (String)
ttl column name : ttl (TTL)
Other generic column names are record_insert_time, record_update_time, etc.

I tried the code below, but it fails with the error shown after it.

import boto3

dynamodb = boto3.client('dynamodb')

table_name = 'table name' # i used my table name here.
ttl_threshold = 1631303477  


count = 0

query_params = {
    'TableName': 'table name', # i used my table name here.
    'KeyConditionExpression': 'TTLAttributeName > :threshold',
    'ExpressionAttributeValues': {
        ':threshold': {'N': str(ttl_threshold)}
    }
}


response = dynamodb.query(**query_params)

count += response['Count']

while 'LastEvaluatedKey' in response:
    query_params['ExclusiveStartKey'] = response['LastEvaluatedKey']
    response = dynamodb.query(**query_params)
    count += response['Count']

print(f"Total count of items with TTL greater than {ttl_threshold}: {count}")

Below is the Error:

ClientError                               Traceback (most recent call last)
<ipython- in <module>
     19 
     20 
---> 21 response = dynamodb.query(**query_params)
     22 
     23 count += response['Count']

~/anaconda3/python3.7/site-packages/botocore/client.py in _api_call(self, *args, **kwargs)
    513                 )
    514             # The "self" in this scope is referring to the BaseClient.
--> 515             return self._make_api_call(operation_name, kwargs)
    516 
    517         _api_call.__name__ = str(py_operation_name)

~/anaconda3/envs/JupyterSystemEnv/lib/python3.7/site-packages/botocore/client.py in _make_api_call(self, operation_name, api_params)
    932             error_code = parsed_response.get("Error", {}).get("Code")
    933             error_class = self.exceptions.from_code(error_code)
--> 934             raise error_class(parsed_response, operation_name)
    935         else:
    936             return parsed_response

ClientError: An error occurred (ValidationException) when calling the Query operation: Query condition missed key schema element: customer_id

Solution

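  • You cannot use Query here: Query requires an equality condition on the partition key (customer_id), and ttl is not part of the key schema, which is what the ValidationException is telling you. Use the Scan API instead to read all of the items and filter them on ttl. A single Scan call reads at most 1 MB of data, so for a larger table you also need to paginate using LastEvaluatedKey and ExclusiveStartKey.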

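    import boto3
    from boto3.dynamodb.conditions import Attr
    from botocore.exceptions import ClientError
    
    session = boto3.session.Session()
    # Attr() conditions are accepted by the Table resource, not by the low-level client.
    table = session.resource('dynamodb').Table('table name')  # use your table name
    
    # Epoch seconds; use int(time.time()) to count items that have not expired yet.
    ttl_threshold = 1631303477
    
    scan_params = {
        'FilterExpression': Attr('ttl').gte(ttl_threshold),
        'Select': 'COUNT',  # return only counts, not the items themselves
    }
    
    count = 0
    try:
        while True:
            response = table.scan(**scan_params)
            count += response['Count']
            # Each Scan call reads at most 1 MB; continue until no key is returned.
            if 'LastEvaluatedKey' not in response:
                break
            scan_params['ExclusiveStartKey'] = response['LastEvaluatedKey']
    
        print(f"Total count of items with ttl >= {ttl_threshold}: {count}")
    
    except ClientError as error:
        print(error.response['Error'])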
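
If you would rather stay with the low-level client from the question, the same count can be sketched as below. This is only an illustration under a few assumptions: the function name count_unexpired_items and its parameters are made up for this example, the attribute is assumed to be the ttl column from the question stored as epoch seconds, and the client argument is assumed to behave like boto3.client('dynamodb'). The client takes the filter as an expression string, and because ttl is a DynamoDB reserved word it is referenced through an ExpressionAttributeNames placeholder.

```python
import time


def count_unexpired_items(client, table_name, ttl_attribute="ttl", now=None):
    """Count items whose TTL attribute is >= now, following Scan pagination.

    `client` is assumed to behave like boto3.client("dynamodb").
    The function name and parameters are hypothetical, for illustration only.
    """
    if now is None:
        now = int(time.time())  # TTL values are stored as epoch seconds

    params = {
        "TableName": table_name,
        # "ttl" is a reserved word, so it is referenced through a placeholder.
        "FilterExpression": "#t >= :now",
        "ExpressionAttributeNames": {"#t": ttl_attribute},
        "ExpressionAttributeValues": {":now": {"N": str(now)}},
        # Ask only for counts so item bodies are not sent back.
        "Select": "COUNT",
    }

    total = 0
    while True:
        response = client.scan(**params)
        total += response["Count"]
        last_key = response.get("LastEvaluatedKey")
        if not last_key:
            return total
        # Resume the scan where the previous page stopped.
        params["ExclusiveStartKey"] = last_key
```

It could be called as count_unexpired_items(boto3.client('dynamodb'), 'table name'). Keep in mind that a Scan still reads every item in the table, so the filter reduces what is returned, not the read capacity consumed. The filter is needed at all because DynamoDB does not delete expired items immediately, so items past their ttl can remain visible for a while, which may be part of why the table shows more events than the source.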