I have data coming into two DynamoDB tables. Let's call them Widgets and Kerfuffles. Each Widget "has a" Kerfuffle, but a Kerfuffle could belong to several Widgets. Now normally, I'd say I could use DDB Streams to kick off a lambda to publish my Widget-Kerfuffle pair to SNS. However, Widgets and their Kerfuffles don't necessarily arrive together. In fact, the Kerfuffle could arrive 5-10 minutes before or after the Widget.
So it would seem like I can't just have a lambda trigger on the Widget or the Kerfuffle being Created because the other half might not be present (and I don't want to send down duplicate Widgets either).
Any suggestions on how to handle this?
Typing is hard. Let widget = A and kerfuffle = B.
Real-time: you process notifications off of new A's and new B's. For each A notification, you check whether its B is present. If it isn't, stop. Otherwise, process that A. For each B notification, you collect all present A's matching it and process them all. Note that you'll need some sort of locking here if you want to avoid processing an A multiple times when it triggers very close to its B and both processes succeed.
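A minimal sketch of the real-time approach, using in-memory dicts in place of the two tables and a list in place of SNS. All names here (`widgets`, `kerfuffles`, `published`, `claim`, `on_widget`, `on_kerfuffle`) are hypothetical, and the handlers assume the new record is already stored by the time its notification fires. The `claim` step plays the role of the locking: against DynamoDB it could plausibly be a conditional update that only succeeds while the item is still unprocessed.

```python
import threading

# In-memory stand-ins (assumed shapes, not the real tables).
widgets = {}      # widget_id -> {"kerfuffle_id": str, "processed": bool}
kerfuffles = {}   # kerfuffle_id -> payload
published = []    # stand-in for SNS: list of (widget_id, kerfuffle_id)

_claim_lock = threading.Lock()


def claim(widget_id):
    """Atomically flip processed False -> True; only one caller wins."""
    with _claim_lock:
        widget = widgets[widget_id]
        if widget["processed"]:
            return False
        widget["processed"] = True
        return True


def publish_pair(widget_id):
    """Publish the A/B pair if B is present and nobody else claimed this A."""
    kerfuffle_id = widgets[widget_id]["kerfuffle_id"]
    if kerfuffle_id in kerfuffles and claim(widget_id):
        published.append((widget_id, kerfuffle_id))


def on_widget(widget_id):
    """Handler for a new-A notification: stops silently if B is missing."""
    publish_pair(widget_id)


def on_kerfuffle(kerfuffle_id):
    """Handler for a new-B notification: process every A already waiting."""
    for widget_id, widget in list(widgets.items()):
        if widget["kerfuffle_id"] == kerfuffle_id:
            publish_pair(widget_id)
```

Because both handlers go through `claim`, an A whose notification races with its B's notification is published by whichever handler claims it first and skipped by the other. Note that if publishing fails after a successful claim, this sketch would leave the A marked as processed, so a real version would need to release the claim or retry.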
Near-real-time: once in a while (every t minutes), find all A's that have not been processed. Process all those that have matching B's, and mark those A's as processed.
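The periodic pass could look roughly like the sketch below. It takes plain dicts in place of the tables and a `publish` callback in place of SNS; the function name `sweep` and the record shape are assumptions for illustration. On AWS, something like this might run from a Lambda on a schedule, with the interval playing the role of t.

```python
def sweep(widgets, kerfuffles, publish):
    """Process every unprocessed A whose B has arrived.

    widgets:    widget_id -> {"kerfuffle_id": str, "processed": bool}
    kerfuffles: kerfuffle_id -> payload
    publish:    callable(widget_id, kerfuffle_id), e.g. an SNS publish wrapper
    Returns the ids of the A's handled in this pass.
    """
    handled = []
    for widget_id, widget in widgets.items():
        # Skip A's already done and A's still waiting for their B.
        if widget["processed"] or widget["kerfuffle_id"] not in kerfuffles:
            continue
        publish(widget_id, widget["kerfuffle_id"])
        widget["processed"] = True  # mark only after a successful publish
        handled.append(widget_id)
    return handled
```

Since a single pass is the only thing that processes A's, no locking between handlers is needed, provided two passes never overlap.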
Method 1: A's that don't have B's yet.
Method 2: t minutes. This can be inconsequential or extremely impractical, depending on your application.