google-cloud-datastore app-engine-ndb

Bulk writes when one object requires the Key of another


I'm trying to figure out a good solution to this, using Python and the NDB library.

I want to create an entity that is tied to another entity, and both are created at the same time. An example would be creating a Message for a large number of users. We have an Inbox table/kind and a Message table/kind.

Once we have gathered the Keys of all the users we want, what I'm doing is creating the Inbox entity, saving it, attaching the Key that the save returns to the Message, and then saving the Message. For a large number of users this seems pretty expensive: two sequential writes per user. Normally I would just create the objects and then use ndb.put_multi() to batch the writes, but since an entity has no Key until it's saved, I can't do that.

Hope that made sense. Ideas?


Solution

  • Take a look at the allocate_ids API. You can pass a parent key and get IDs assigned under it; allocate_ids guarantees that the allocated IDs will not be assigned again within that parent key's context. It is a small, fast operation. Once the IDs are allocated, you can build the keys yourself, reference them from the other entities, and write everything with put_multi. As I understand it, the Message entity itself is not referenced by anything, so you only need to allocate IDs for Inbox (presumably only for users who don't already have one) and then do a single put_multi on both the Inbox and Message entities.
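
A rough sketch of that flow, under several assumptions: it targets the legacy App Engine NDB library (google.appengine.ext.ndb), where Model.allocate_ids returns the inclusive (first, last) ID range; in Cloud NDB the same call returns a tuple of keys instead, so the pairing step would differ. The properties user, inbox and body and the helpers pair_users_with_ids and send_to_users are hypothetical names made up for illustration, and IDs are allocated without a parent so that one call covers every user.

```python
try:
    # Legacy App Engine NDB; assumed to be the library the question uses.
    from google.appengine.ext import ndb
except ImportError:
    ndb = None  # SDK not installed: only the pure helper below is usable


def pair_users_with_ids(user_keys, first_id, last_id):
    """Match each user key with one ID from the inclusive range first_id..last_id."""
    ids = list(range(first_id, last_id + 1))
    if len(ids) != len(user_keys):
        raise ValueError("need exactly one allocated ID per user")
    return list(zip(user_keys, ids))


if ndb is not None:

    class Inbox(ndb.Model):
        user = ndb.KeyProperty()  # hypothetical property

    class Message(ndb.Model):
        inbox = ndb.KeyProperty(kind=Inbox)  # hypothetical property
        body = ndb.TextProperty()  # hypothetical property

    def send_to_users(user_keys, body):
        """Create one Inbox and one Message per user with a single batched put."""
        if not user_keys:
            return []
        # Reserve IDs up front; legacy NDB returns the inclusive (first, last) range.
        first_id, last_id = Inbox.allocate_ids(size=len(user_keys))
        entities = []
        for user_key, inbox_id in pair_users_with_ids(user_keys, first_id, last_id):
            inbox_key = ndb.Key(Inbox, inbox_id)  # key is known before any write
            entities.append(Inbox(key=inbox_key, user=user_key))
            entities.append(Message(inbox=inbox_key, body=body))
        # One batched call writes every Inbox and Message together.
        return ndb.put_multi(entities)
```

Each user still ends up with two entities written, so this should mainly save round trips rather than per-entity write cost: the sequential save-then-save per user becomes one allocate_ids call plus one put_multi call for the whole batch.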