Firestore in Datastore mode - a distributed counter that can scale its shards up based on traffic

In Firestore in Datastore mode, the recommended way to store a high-write counter (such as profile views on a website) is to use sharded (distributed) counters.

The problem I have is that with distributed counters you need to pick the number of shards up front. This is addressed here as well. For example, some profiles may get far more views per second than others (one profile may belong to a famous person, another to a regular person) and therefore need more shards.

Is there a way to write a distributed counter that can scale its shards up if the page is getting a lot of views per second?

I was thinking of detecting a datastore contention error and then adding more shards if that happens.
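
A rough sketch of that idea, using an in-memory stand-in rather than the real datastore client. `ContentionError`, `AdaptiveCounter`, the `is_contended` callback, and the doubling policy are all hypothetical names and choices made up for illustration; a real implementation would catch whatever contention or aborted-transaction error the client library raises and would persist the shard count.

```python
import random


class ContentionError(Exception):
    """Hypothetical stand-in for the datastore's contention error."""


class AdaptiveCounter:
    """In-memory sketch: the shard count grows when a write hits contention."""

    def __init__(self, num_shards=1, max_shards=64):
        self.num_shards = num_shards
        self.max_shards = max_shards
        self.shards = {}  # shard index -> count

    def _write_shard(self, index, is_contended):
        # is_contended is a hypothetical callback that simulates a hot shard.
        if is_contended(index):
            raise ContentionError(index)
        self.shards[index] = self.shards.get(index, 0) + 1

    def increment(self, is_contended=lambda index: False, max_attempts=5):
        for _ in range(max_attempts):
            index = random.randrange(self.num_shards)
            try:
                self._write_shard(index, is_contended)
                return
            except ContentionError:
                # Double the shard count (up to a cap), then retry on a
                # freshly chosen random shard.
                self.num_shards = min(self.num_shards * 2, self.max_shards)
        raise ContentionError("still contended after retries")

    def total(self):
        return sum(self.shards.values())
```

Shards are only ever added here, never removed, so counts already written to existing shards stay valid and the total remains the sum over all shards.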

I noticed there is a new extension for Cloud Firestore that seems to do what I am asking for. However, I am not using Cloud Firestore; I am using Firestore in Datastore mode, which is similar under the hood but still different.


  • The original Datastore distributed counters example:

    import random
    from google.appengine.ext import ndb

    NUM_SHARDS = 20

    class SimpleCounterShard(ndb.Model):
        """Shards for the counter"""
        count = ndb.IntegerProperty(default=0)

    def get_count():
        """Retrieve the value for a given sharded counter.

        Returns:
            Integer; the cumulative count of all sharded counters.
        """
        total = 0
        for counter in SimpleCounterShard.query():
            total += counter.count
        return total

    @ndb.transactional
    def increment():
        """Increment the value for a given sharded counter."""
        shard_string_index = str(random.randint(0, NUM_SHARDS - 1))
        counter = SimpleCounterShard.get_by_id(shard_string_index)
        if counter is None:
            counter = SimpleCounterShard(id=shard_string_index)
        counter.count += 1
        counter.put()  # persist the updated shard

    This example uses a fixed number of shards, but the Firestore example uses a separate entity to keep track of the number of shards. So you can update the code above with something like:

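    class RootCounter(ndb.Model):
        """Root entity that tracks how many shards the counter currently has."""
        count = ndb.IntegerProperty(default=0)
        num_shards = ndb.IntegerProperty(default=0)

        def get_count(self):
            # Shards are children of the root, so an ancestor query finds them all.
            shards = SimpleCounterShard.query(ancestor=self.key)
            return self.count + sum(e.count for e in shards)

        def add_shard(self):
            # Call this when writes start to contend.
            self.num_shards += 1
            self.put()

        def increment(self):
            if self.num_shards > 0:
                # Increment a randomly chosen child shard of this root.
                shard_id = str(random.randint(0, self.num_shards - 1))
                shard = SimpleCounterShard.get_or_insert(shard_id, parent=self.key)
                shard.count += 1
                shard.put()
            else:
                # No shards yet: count directly on the root entity.
                self.count += 1
                self.put()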

    The important difference since the release of Firestore in Datastore mode is that queries are strongly consistent and you are no longer constrained by the old entity-group write limits. Thus a query gives an exact answer, and the sharded counters fit naturally in the hierarchy under the root counter.
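
To illustrate the hierarchy point, here is a minimal in-memory sketch. It does not use the datastore at all: the `store` dict, the tuple key paths, and the helper names are assumptions made up for this example, standing in for entities keyed under a root counter and for an ancestor query.

```python
import random

# Hypothetical in-memory stand-in for the datastore: maps a key path to a count.
# A key path is a tuple, e.g. ("RootCounter", "profile-1", "Shard", "3").
store = {}


def shard_key(root_key, index):
    """Build a child key under the root counter's key."""
    return root_key + ("Shard", str(index))


def increment(root_key, num_shards):
    """Add 1 to a randomly chosen child shard of root_key."""
    key = shard_key(root_key, random.randrange(num_shards))
    store[key] = store.get(key, 0) + 1


def get_count(root_key):
    """Mimic an ancestor query: sum every child whose key path starts with root_key."""
    return sum(
        count for key, count in store.items()
        if key[:len(root_key)] == root_key and key != root_key
    )
```

Because every shard lives under its root's key, each profile can have a different number of shards, and raising `num_shards` for a busy profile does not disturb counts already stored in its existing shards.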