I've got a Kafka topic which contains catalog data in the form of the following commands:
Now I need to consume this topic, possibly streaming 100k msgs/sec, into some DB that helps me translate the original stream of commands into a stream of item states, so that the DB only ever holds the current state of each item. Basically, the DB will be used as a lookup table.
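Conceptually, the consuming side would be something like the sketch below (Java; the topic name, record layout, and the `applyCommand` helper are just placeholders for illustration):

```java
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

import java.time.Duration;
import java.util.List;
import java.util.Properties;

public class CatalogStateConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "catalog-state-builder");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("catalog-commands"));   // hypothetical topic name
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // record.key()   -> item id
                    // record.value() -> serialized command (upsert / delete / delete_all ...)
                    applyCommand(record.key(), record.value());
                }
            }
        }
    }

    // Placeholder: translate one command into a write against the lookup store
    // (Datastore, Bigtable, ...) so the store always holds the current item state.
    static void applyCommand(String itemId, String command) {
        // e.g. upsert -> put(itemId, newState); delete -> delete(itemId)
    }
}
```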
My idea was to use Cloud Datastore as that lookup store.
My worries are about the ACID properties of Datastore. How "ACID" is it really? Is it even suitable for such a use case?
I was also thinking about using the cheaper Bigtable, but it doesn't seem like the right choice for this use case.
If you have any ideas or recommendations on how else to solve this, I'd be glad to hear them.
Bigtable can handle a rate of 100K updates per second with a 10 node cluster (I have run tests up to 3,500 nodes, which handle 35M updates per second). Bigtable has strong consistency for single-row upserts, which is why Bigtable users design schemas that fit all of their transactional data into a single row.
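For example, a single-row-per-item layout could look like this (a minimal sketch with the Bigtable Java client; the project, instance, table, column family, and qualifier names are all made up):

```java
import com.google.cloud.bigtable.data.v2.BigtableDataClient;
import com.google.cloud.bigtable.data.v2.models.RowMutation;

public class ItemUpsert {
    public static void main(String[] args) throws Exception {
        // Hypothetical ids: my-project / my-instance / "catalog" table.
        try (BigtableDataClient data = BigtableDataClient.create("my-project", "my-instance")) {
            // One row per item: row key = item id, all state fields live in one
            // column family, so a single mutateRow call updates the item's state
            // atomically.
            RowMutation upsert = RowMutation.create("catalog", "item#42")
                .setCell("state", "title", "Blue widget")
                .setCell("state", "price", "19.99")
                .setCell("state", "stock", "130");
            data.mutateRow(upsert);
        }
    }
}
```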
Cloud Bigtable supports upserts and does not have a distinction between insert and update. There is also a delete-by-range operation that could theoretically be used for your delete_all case.
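A sketch of that delete-by-range path, using the admin client's dropRowRange (the table id and key prefix are assumptions; note this is an admin operation intended for occasional bulk cleanup, not for the per-message hot path):

```java
import com.google.cloud.bigtable.admin.v2.BigtableTableAdminClient;

public class CatalogDeleteAll {
    public static void main(String[] args) throws Exception {
        try (BigtableTableAdminClient admin = BigtableTableAdminClient.create("my-project", "my-instance")) {
            // Drops every row whose key starts with the given prefix,
            // e.g. all rows keyed "item#..." in the hypothetical "catalog" table.
            admin.dropRowRange("catalog", "item#");
        }
    }
}
```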
The high transaction rate and the lower cost are the right reasons to use Cloud Bigtable here. Alternatively, you can consider Cloud Spanner, which is designed for high-throughput transactional data.