I am trying to update multiple DynamoDB tables with a new attribute (column) that will contain values. The tables contain over 10 million items. I can't use the boto3 BatchWriteItem method because it overwrites the entire item, and I need to preserve the existing item data.
I've attempted to use the boto3 UpdateItem method, but it is very slow for updating this many items.
Questions:
Is there a way to batch UpdateItem calls instead of having to send millions of calls?

Any help is appreciated, thank you!
There are some ways you can speed things up:
1. Use BatchExecuteStatement, which lets you batch up to 25 PartiQL UPDATE statements into a single request.
2. Use a Parallel Scan to retrieve the keys, with multiple threads to parallelize the work.
3. Use AWS Glue to provide distributed compute. This is similar to #2, but Spark's distributed processing gives you much more processing power.
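
To make the first two options concrete, here is a rough sketch that combines them: each thread scans one segment of the table for keys and sends PartiQL UPDATE statements in batches of 25. The table name, the key attribute `pk` (assumed to be a string partition key with no sort key), the attribute name `new_col`, and the helper names are all placeholders I made up, so adjust them to your schema. It also only counts failed statements rather than retrying them, which you would want to handle in a real run.

```python
from concurrent.futures import ThreadPoolExecutor

TABLE = "my-table"    # hypothetical table name
KEY_ATTR = "pk"       # hypothetical string partition key (no sort key assumed)
NEW_ATTR = "new_col"  # hypothetical new attribute; assumed not a reserved word
BATCH_SIZE = 25       # BatchExecuteStatement accepts at most 25 statements


def build_statements(keys, value):
    """Build one PartiQL UPDATE per key; SET leaves other attributes untouched."""
    statement = f'UPDATE "{TABLE}" SET {NEW_ATTR} = ? WHERE {KEY_ATTR} = ?'
    return [
        {"Statement": statement, "Parameters": [{"S": value}, {"S": key}]}
        for key in keys
    ]


def update_segment(client, segment, total_segments, value):
    """Scan one segment for keys and update them in batches. Returns failure count."""
    failed = 0
    scan_kwargs = {
        "TableName": TABLE,
        "Segment": segment,
        "TotalSegments": total_segments,
        # Fetch only the key attribute to keep the scan cheap.
        "ProjectionExpression": "#k",
        "ExpressionAttributeNames": {"#k": KEY_ATTR},
    }
    while True:
        page = client.scan(**scan_kwargs)
        keys = [item[KEY_ATTR]["S"] for item in page.get("Items", [])]
        for i in range(0, len(keys), BATCH_SIZE):
            resp = client.batch_execute_statement(
                Statements=build_statements(keys[i:i + BATCH_SIZE], value)
            )
            # Each response entry carries an "Error" key if that statement failed.
            failed += sum(1 for r in resp.get("Responses", []) if "Error" in r)
        if "LastEvaluatedKey" not in page:
            return failed
        scan_kwargs["ExclusiveStartKey"] = page["LastEvaluatedKey"]


def update_table(client, value, total_segments=8):
    """Run one worker thread per scan segment and sum the failure counts."""
    with ThreadPoolExecutor(max_workers=total_segments) as pool:
        futures = [
            pool.submit(update_segment, client, seg, total_segments, value)
            for seg in range(total_segments)
        ]
        return sum(f.result() for f in futures)


def main():
    import boto3  # imported here so the helpers above can be tested without AWS

    client = boto3.client("dynamodb")
    print("failed statements:", update_table(client, "default-value"))
```

Call `main()` to run it against a real table. The number of segments is a tuning knob: more segments means more parallelism, but the table's write capacity is still the ceiling, so expect throttling errors if you push past it.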