Search code examples
pythonpython-2.7google-app-engineapp-engine-ndb

Query on children table and order result of a query by parent's fields at NDB


At my model, I have two tables like the ones below that are related:

from google.appengine.ext import ndb

class Parent(ndb.Model):
    name = ndb.StringProperty()

class Child(ndb.Model):
    parent = ndb.KeyProperty(kind=Parent, indexed=True)
    name = ndb.StringProperty()

I want to query on Child table and order children records by the parent name ( ascending or descending), How can I do this with NDB ORM?

As much as possible, I do not want this process to be done at the code level with the help of Python.


Solution

  • In general @gaefan is correct, you can't do this properly.

    There's only 3 ways I can see to kind of make this work.

    1. the parent name has to also be the key for parent, then you could order Child.query().order(Child.parent).fetch(). The downside is that changing the name of parent is not really possible. You'd have to change its key and all foreign keys on all Child models and other models that had a foreign key to it

    2. denormalize parent name as a field on Child, like this:

      class Child(ndb.Model): parent = ndb.KeyProperty(kind=Parent, indexed=True) name = ndb.StringProperty() parent_name = ndb.StringProperty()

    Then you could do Child.query().order(Child.parent_name).fetch(). The downside is again that if you want to change Parent.name, you'd have to then update all of it's children.

    1. If this is for something like a http handler and you need to return a page of 10 results or something, you could first fetch 10 parents ordered by name, then run a query for all children who have one of those parents Child.query(namespace="test").filter(Child.parent.IN(order_parent_keys)).fetch() (this is what you suggested in your comment). You would just then have to order the children after the fact by parent name. The issue with this method, is this only works for a limited number of parents at a time (I forget how many values you can pass to .IN()). Also the pagination would get weird when you run out of children for the original group of order_parent_keys and need to get the next set.

    The right solution for you really depends on your data. How many parents do you have, how many children do you expect each parent to have. Also, how much data you are trying to process, and how fast you have to process it.

    For example if you need to process your all or most of your db once a day, Google Dataflow would be perfect for this.