Tags: python, database, google-app-engine, app-engine-ndb, google-app-engine-python

"Maximum recursion depth exceeded" appears when a Google NDB tasklet tries to update the data


I joined a project with a high rate of change; a lot of development had already been done, and the codebase looks like a monolithic architecture. In several places we duplicate fields from an ancestor model into a child model to support filtering and sorting features (I asked about this before and the only solution offered to me was to use duplicated fields: LINK). Therefore we need to update these duplicated fields whenever the ancestor is updated. Because the project is large, I decided to use the model hooks instead of finding and updating every call site in the source code, and I implemented this feature with the _post_put_hook and _pre_put_hook hooks of the Google NDB model class.

class ParentOne(ndb.Model):
    name_parent_one = ndb.StringProperty()
    ....

    def _post_put_hook(self, future):
        # After every put, copy the (possibly changed) name into the
        # matching join entity.
        obj = JoinParentNameClasses.query(
            JoinParentNameClasses.parent_one == self.key).get()
        obj.name_parent_one = self.name_parent_one
        obj.put()

class ParentTwo(ndb.Model):
    name_parent_two = ndb.StringProperty()
    ....

    def _post_put_hook(self, future):
        obj = JoinParentNameClasses.query(
            JoinParentNameClasses.parent_two == self.key).get()
        obj.name_parent_two = self.name_parent_two
        obj.put()

class JoinParentNameClasses(ndb.Model):
    parent_one = ndb.KeyProperty(kind='ParentOne')
    name_parent_one = ndb.StringProperty()
    parent_two = ndb.KeyProperty(kind='ParentTwo')
    name_parent_two = ndb.StringProperty()
    ... some other fields used by the API ...

But now a bigger problem has emerged: when someone uses put_multi or put_multi_async of Google NDB, NDB creates a lot of Future objects, processes them with ndb.tasklet, and then retrieves the final result with .get_result(). The ndb.tasklet implementation behaves like Python coroutines, so when there are many rows to update, each _post_put_hook that updates the children adds more depth to the call stack, and "maximum recursion depth exceeded" appears. How can I solve this problem?
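
For illustration, a minimal sketch of the kind of call that triggers this (the batch size here is just a placeholder):

# Every entity in the batch runs a nested query + put inside its
# _post_put_hook, and the tasklet event loop can drive the Python
# call stack past its limit for large batches.
parents = [ParentOne(name_parent_one='p%d' % i) for i in range(10000)]
ndb.put_multi(parents)  # RuntimeError: maximum recursion depth exceeded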

Note: I know about sys.setrecursionlimit(1000), but this isn't a good solution; I'm looking for a best practice.


Solution

  • Using _post_put_hook like this seems to be asking for trouble. Serial puts are a bad idea because they are slow and increase datastore contention, and you might have created an infinite loop as well.

    Instead, you want to batch your puts together:

    class ParentOne(ndb.Model):

        def put_me(self):
            # Fetch the join entity once, then write both records with a
            # single batched RPC instead of relying on _post_put_hook.
            obj = JoinParentNameClasses.query(
                JoinParentNameClasses.parent_one == self.key).get()
            if obj is None:  # no join row to sync yet
                self.put()
                return
            obj.name_parent_one = self.name_parent_one
            ndb.put_multi([self, obj])
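
    For example, a caller would then do something like this (the id is just a placeholder):

    parent = ParentOne.get_by_id(1234)  # placeholder id
    parent.name_parent_one = 'new name'
    parent.put_me()  # writes the parent and its join row in one batched RPC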
    

    You mention that you are already using ndb.put_multi, which is exactly what triggers all the _post_put_hook calls. Instead, write a custom function (sketched after this list) that:

    • receives a list of objects
    • gets all the relevant parent objects
    • puts them all with a single ndb.put_multi
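
    A minimal sketch of such a function, assuming the models from the question and the legacy google.appengine.ext.ndb library; the helper name update_parents_batched and the None guard are illustrative additions, not from the original answer:

    from google.appengine.ext import ndb

    def update_parents_batched(parents):
        # Kick off one asynchronous join-entity lookup per parent.
        futures = [
            JoinParentNameClasses.query(
                JoinParentNameClasses.parent_one == parent.key).get_async()
            for parent in parents
        ]
        to_put = list(parents)
        for parent, future in zip(parents, futures):
            obj = future.get_result()
            if obj is not None:  # skip parents without a join row
                obj.name_parent_one = parent.name_parent_one
                to_put.append(obj)
        # One batched put instead of N serial puts inside hooks,
        # so there is no nested tasklet recursion.
        ndb.put_multi(to_put)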