python google-app-engine google-cloud-datastore app-engine-ndb

Ndb strong consistency and frequent writes


I'm trying to achieve strong consistency with ndb in Python, but it looks like I'm missing something: my reads behave as if they were not strongly consistent.

The query is:

links = Link.query(ancestor=lead_key).filter(Link.last_status == None).fetch(keys_only=True)

if not links:
    do_action()

The key structure is:

Lead root (generic key) -> Lead -> Website (one per lead) -> Link

I have many tasks that run concurrently using TaskQueue, and this query is performed at the end of every task. Sometimes I get a "too much contention" exception when updating the last_status field, which I handle with retries. Can the contention or the retries break strong consistency?

The expected behavior is having do_action() called when there are no links left with last_status equal to None. The actual behavior is inconsistent: sometimes do_action() is called twice and sometimes not called at all.


Solution

  • Using an ancestor key to get strong consistency has a limitation: you're limited to about one write per second per entity group. One way to work around this is to shard the entity groups. The "Sharding Counters" article describes the technique. It's an old article, but as far as I know, the advice is still sound.
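
To make the sharding idea concrete, here is a minimal sketch of a sharded counter in plain Python. It is not ndb code: the dict is an in-memory stand-in for the datastore, and each dict slot plays the role of one entity in its own entity group. The names ShardedCounter, NUM_SHARDS and the "pending-links" counter are hypothetical and only there for illustration. In a real version, each shard update would presumably be a small transaction on that shard's entity.

```python
import random

# Assumption: 20 shards; tune to the write rate you actually need.
NUM_SHARDS = 20


class ShardedCounter:
    """In-memory model of a sharded counter.

    Each entry in self.shards stands in for one datastore entity that
    lives in its own entity group, so writes to different shards would
    not contend with each other.
    """

    def __init__(self, name, num_shards=NUM_SHARDS):
        self.name = name
        self.num_shards = num_shards
        self.shards = {}  # shard key -> partial count

    def _shard_key(self, index):
        # Hypothetical naming scheme for shard entities.
        return "%s-shard-%d" % (self.name, index)

    def increment(self, delta=1):
        # Pick a random shard so concurrent writers spread out
        # instead of all hitting the same entity group.
        index = random.randint(0, self.num_shards - 1)
        key = self._shard_key(index)
        self.shards[key] = self.shards.get(key, 0) + delta
        return key

    def total(self):
        # The real value is the sum over all shards.
        return sum(self.shards.values())


# Usage: track how many links are still pending for one lead.
pending = ShardedCounter("pending-links", num_shards=4)
for _ in range(10):
    pending.increment()      # one per link created
for _ in range(10):
    pending.increment(-1)    # one per link processed by a task
remaining = pending.total()  # 0 once every link has been processed
```

The trade-off is that a read now has to fetch every shard and add them up, in exchange for spreading the writes across several entity groups.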