I have exported and transformed 340 million rows from DynamoDB into S3. I am now trying to import them back into DynamoDB using the Data Pipeline.
I have my table's write provisioning set to 5600 capacity units, but I can't seem to get the pipeline to use more than 1000-1200 of them (it's hard to tell the true number because of the granularity of the metric graph).
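To get exact numbers instead of eyeballing the graph, the consumed throughput can be pulled straight from CloudWatch. A minimal sketch, assuming boto3 and a placeholder table name (`my-import-table` is hypothetical):

```python
import boto3
from datetime import datetime, timedelta, timezone

# Placeholder region and table name; substitute your own.
cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")
now = datetime.now(timezone.utc)

resp = cloudwatch.get_metric_statistics(
    Namespace="AWS/DynamoDB",
    MetricName="ConsumedWriteCapacityUnits",
    Dimensions=[{"Name": "TableName", "Value": "my-import-table"}],
    StartTime=now - timedelta(hours=1),
    EndTime=now,
    Period=60,               # one datapoint per minute
    Statistics=["Sum"],
)

# The metric is reported as a sum over each period; dividing by the
# period length gives the average consumed write units per second.
for point in sorted(resp["Datapoints"], key=lambda p: p["Timestamp"]):
    print(point["Timestamp"], point["Sum"] / 60.0)
```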
I have tried increasing the number of slave nodes as well as the instance size for each slave node, but nothing seems to make a difference.
Does anyone have any thoughts?
The problem was that there was a secondary index on the table. Regardless of the write provisioning level I chose and the number of machines in the EMR cluster, I couldn't get more than 1000 or so consumed write units. I had the provisioning set to 7000, so 1000 was not acceptable.
As soon as I removed the secondary index, the write provisioning maxed out.
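If anyone else runs into this: each write to a table also consumes write capacity for its secondary indexes, so an index can become the bottleneck well below the table's own provisioning. A quick way to check which indexes a table carries, again sketched with boto3 and a placeholder table name:

```python
import boto3

dynamodb = boto3.client("dynamodb", region_name="us-east-1")

# Placeholder table name; substitute your own.
desc = dynamodb.describe_table(TableName="my-import-table")["Table"]

# Local secondary indexes share the table's provisioned throughput;
# global secondary indexes have their own, and an under-provisioned
# GSI throttles writes to the base table.
for lsi in desc.get("LocalSecondaryIndexes", []):
    print("LSI:", lsi["IndexName"])

for gsi in desc.get("GlobalSecondaryIndexes", []):
    wcu = gsi.get("ProvisionedThroughput", {}).get("WriteCapacityUnits")
    print("GSI:", gsi["IndexName"], "write units:", wcu)

# A GSI can be dropped in place (the index name here is hypothetical);
# an LSI cannot, so removing one means recreating the table.
# dynamodb.update_table(
#     TableName="my-import-table",
#     GlobalSecondaryIndexUpdates=[{"Delete": {"IndexName": "my-index"}}],
# )
```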