flask, amazon-dynamodb, boto3

Flask app with boto3 (DynamoDB) - inconsistent data retrieval


I have an AWS IoT rule configured to save messages from MQTT to a DynamoDB v2 table. Saving works fine: an Arduino runs code that sends data to MQTT, and the data is saved to DynamoDB as expected.

I also have a Flask application running on an EC2 instance. The problem is that data retrieval is inconsistent. As I write this, a scan in the AWS console returns 58 entries (I have to click "retrieve next page"; otherwise I would only see 50 items). But when I call my app, I sometimes see 50 items, sometimes 54, and sometimes even 58!

If I restart the apache2 server, the count appears to be correct again.

Below is the code I'm running. Any thoughts?

from flask import Flask
import boto3
from boto3.dynamodb.conditions import Key, Attr
from datetime import datetime

AWS_ACCESS_KEY_ID = 'key'
AWS_SECRET_ACCESS_KEY = 'access_key'
REGION_NAME = 'region'

dynamodb = boto3.resource(
    'dynamodb',
    aws_access_key_id     = AWS_ACCESS_KEY_ID,
    aws_secret_access_key = AWS_SECRET_ACCESS_KEY,
    region_name           = REGION_NAME
)

table = dynamodb.Table('my_db')

lastEvaluatedKey = None
items = []

# Loop until every page has been fetched
while True:
    if lastEvaluatedKey is None:
        response = table.scan()
    else:
        response = table.scan(
            ExclusiveStartKey=lastEvaluatedKey
        )

    items.extend(response['Items'])

    if 'LastEvaluatedKey' in response:
        lastEvaluatedKey = response['LastEvaluatedKey']
    else:
        break

# localDateTime is my sort key
sorted_items = sorted(items, key=lambda x: datetime.strptime(x['localDateTime'], '%Y-%m-%dT%H:%M:%S'), reverse=True)

app = Flask(__name__)

@app.route('/')
def my_app():
    return sorted_items

if __name__ == '__main__':
    app.run()

I'm trying to get all the data from DynamoDB.

I also tried this AWS example, which uses a paginator, but had the same issue. The first time I run it after restarting apache2, I get the right number of items. Then I add some items, but they don't appear until I restart apache2 again:

from flask import Flask
import boto3
from boto3.dynamodb.conditions import Key, Attr
from datetime import datetime

AWS_ACCESS_KEY_ID = 'key'
AWS_SECRET_ACCESS_KEY = 'secret'
REGION_NAME = 'region'

dynamodb = boto3.client(
    'dynamodb',
    aws_access_key_id     = AWS_ACCESS_KEY_ID,
    aws_secret_access_key = AWS_SECRET_ACCESS_KEY,
    region_name           = REGION_NAME
)

paginator = dynamodb.get_paginator('query')
response = paginator.paginate(
    TableName='my-table',
    KeyConditionExpression="id = :id",
    ExpressionAttributeValues={
        ":id": {"S": "my_id"}
    }
)

items = []

# Flatten every page into a single list of items
for page in response:
    items.extend(page['Items'])

app = Flask(__name__)

@app.route('/')
def my_app():
    return items

if __name__ == '__main__':
    app.run()

Solution

  • I think your pagination logic is amiss. Try this implementation:

    response = table.scan()
    data = response['Items']
    
    while 'LastEvaluatedKey' in response:
        response = table.scan(ExclusiveStartKey=response['LastEvaluatedKey'])
        data.extend(response['Items'])
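
Note that both snippets in the question run the scan at module level. Python runs module-level code once per worker process, when the module is imported, so each Apache worker would keep serving whatever it read at start-up. That would explain why a restart fixes the count, though it is an inference from the symptoms: the question does not show the Apache configuration. Below is a minimal sketch of the same pagination loop wrapped in a function that can be called on every request. The names `scan_all` and `FakeTable` are hypothetical; `FakeTable` is only an in-memory stand-in so the sketch runs without AWS credentials.

```python
# Sketch only: scan_all and FakeTable are illustrative names, not part of boto3.

def scan_all(table):
    """Collect every item by following LastEvaluatedKey until it is absent."""
    response = table.scan()
    items = list(response['Items'])
    while 'LastEvaluatedKey' in response:
        response = table.scan(ExclusiveStartKey=response['LastEvaluatedKey'])
        items.extend(response['Items'])
    return items


class FakeTable:
    """In-memory stand-in for a boto3 Table that pages its results."""

    def __init__(self, rows, page_size=50):
        self.rows = rows
        self.page_size = page_size

    def scan(self, ExclusiveStartKey=None):
        start = 0 if ExclusiveStartKey is None else ExclusiveStartKey['index']
        end = start + self.page_size
        response = {'Items': self.rows[start:end]}
        if end < len(self.rows):
            # More rows remain, so signal that another page exists.
            response['LastEvaluatedKey'] = {'index': end}
        return response


table = FakeTable([{'id': i} for i in range(58)])

snapshot = scan_all(table)        # like the module-level scan: taken once
table.rows.append({'id': 58})     # a new message arrives afterwards

fresh = scan_all(table)           # like a scan run inside the view function
print(len(snapshot), len(fresh))  # 58 59
```

In the Flask app, this would mean calling `scan_all(table)` and doing the sort inside `my_app()` rather than at import time, so each request reads the current contents of the table.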