Tags: google-cloud-platform, google-cloud-datastore, gql, google-query-language, gqlquery

Google Cloud Datastore: how to query all distinct Parent/Ancestor?


I have a Datastore Kind named Order, which has an ancestor/parent User.

I'd like to query all distinct ancestors (users) of the orders using GQL, but the following query doesn't work.

SELECT DISTINCT User FROM Order

The query's response is:

No entities matched this query.

Make sure there are either simple or composite indexes for the properties you are searching.

Since the parent is also part of the key, I also tried:

SELECT DISTINCT __key__ FROM Order

But the error response said:

GQL query error: Group by is not supported for the property: __key__


Solution

  • You should note that the datastore ancestry is not established at the entity kind level: you can't really say that the Order kind has a User kind as ancestor.

    Ancestry is established at the entity level: an entity has an ancestor only if one is specified when the entity is created; otherwise it has none. The kind of the ancestor doesn't matter either: different entities of the same kind can have ancestors of different kinds, or no ancestor at all.

    With this clarification in mind, it sounds like each of your Order entities has a User entity as its ancestor.

    An ancestry relationship places all related entities into the same entity group. Each entity without an ancestor is placed in its own entity group, of which it is the root.

    In your case the Order entities are placed in the entity groups of their respective User entities.
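To make the per-entity nature of ancestry concrete, here is a minimal plain-Python sketch. It assumes a key can be modeled as a path of (kind, identifier) pairs with the root first; `make_key`, `parent_of`, and `entity_group_root` are hypothetical helpers written for this illustration, not part of any Datastore client library.

```python
from typing import Optional, Tuple

# Assumption: a key is modeled as a path of (kind, identifier) pairs, root first.
# This is a stand-in for illustration, not the Datastore client library.
PathElement = Tuple[str, object]
Key = Tuple[PathElement, ...]


def make_key(kind: str, identifier, parent: Optional[Key] = None) -> Key:
    """Build a key; the ancestor (if any) is fixed when the key is created."""
    return (parent or ()) + ((kind, identifier),)


def parent_of(key: Key) -> Optional[Key]:
    """Return the direct parent key, or None for an entity with no ancestor."""
    return key[:-1] or None


def entity_group_root(key: Key) -> Key:
    """Return the key of the root entity that defines the entity group."""
    return key[:1]


alice = make_key("User", "alice")
order_1 = make_key("Order", 1, parent=alice)  # placed in alice's entity group
order_2 = make_key("Order", 2)                # no ancestor: root of its own group
```

Both `order_1` and `order_2` are of kind Order, yet only one of them has a User ancestor, which is why the relationship cannot be stated at the kind level.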

    When an ancestor query is made (i.e. either a specific ancestor or descendant entity is specified), the query results will be limited to the scope of that specific entity group only. This allows such queries to be made in a transactional manner, with strongly consistent results.

    For an example of the syntax for an ancestor query see HAS ANCESTOR and HAS DESCENDANT clauses in google cloud datastore.
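As a rough illustration of what an ancestor query selects, the sketch below pairs a GQL string with an in-memory equivalent. The user names, the `total` property, and the helpers `has_ancestor` and `ancestor_query` are made up for this example, and keys are again modeled as paths of (kind, identifier) pairs rather than real Datastore keys.

```python
from typing import Dict, List, Tuple

# Assumption: a key is a path of (kind, identifier) pairs, root first.
Key = Tuple[Tuple[str, object], ...]

# GQL form of an ancestor query for one specific (made-up) user.
ANCESTOR_GQL = "SELECT * FROM Order WHERE __key__ HAS ANCESTOR KEY(User, 'alice')"


def has_ancestor(key: Key, ancestor: Key) -> bool:
    """True if the ancestor's path is a prefix of the key's path."""
    return key[:len(ancestor)] == ancestor


def ancestor_query(entities: Dict[Key, dict], kind: str, ancestor: Key) -> List[Key]:
    """Return keys of the given kind that sit under the given ancestor."""
    return [k for k in entities if k[-1][0] == kind and has_ancestor(k, ancestor)]


alice: Key = (("User", "alice"),)
bob: Key = (("User", "bob"),)

# Three orders spread over two entity groups.
entities: Dict[Key, dict] = {
    alice + (("Order", 1),): {"total": 10},
    alice + (("Order", 2),): {"total": 25},
    bob + (("Order", 3),): {"total": 7},
}

alice_orders = ancestor_query(entities, "Order", alice)
```

Note that the query takes exactly one ancestor key, so it answers "which orders belong to this user", not "which users have orders".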

    The downside is that you can't make ancestor queries spanning multiple entity groups. In your case you're querying for User entities, which are in different entity groups. Even if you placed all User entities in the same group (by specifying a common ancestor key for them), you still couldn't get what you want, since you're looking for distinct ancestors across Orders, which is a "descendant condition": an Order can only have one User as ancestor.

    This brings us to the root cause of your problem: you're using entity ancestry to model entity relationships. That's not what ancestry is designed for; it is designed for strong consistency. I know, it sounds confusing.

    What you can do is forget about datastore ancestry and use plain key properties to model your relationships, with no restrictions. See also "E-commerce Product Categories in Google App Engine (Python)".

    I'd add an order_count property to the User kind and a user key property to the Order kind. Whenever an order is created I'd create an Order entity with its user property set to the respective User entity key and I'd increment the order_count property of that User entity. Then, to get what you want, you simply need to query for User entities which have a non-zero order_count.
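The bookkeeping described above can be sketched with a small in-memory model. This is only an illustration under stated assumptions: `Store`, `create_order`, and `users_with_orders` are hypothetical names, a plain string stands in for the User key property, and no real Datastore client is involved.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class User:
    name: str
    order_count: int = 0  # denormalized counter kept on the User entity


@dataclass
class Order:
    order_id: int
    user: str  # stands in for a key property pointing at the User entity


@dataclass
class Store:
    """In-memory stand-in for the datastore; not a real client."""
    users: Dict[str, User] = field(default_factory=dict)
    orders: Dict[int, Order] = field(default_factory=dict)

    def create_order(self, order_id: int, user_name: str) -> Order:
        # Against a real datastore, these two writes would likely be done
        # together in a transaction so the counter cannot drift.
        user = self.users[user_name]
        order = Order(order_id=order_id, user=user_name)
        self.orders[order_id] = order
        user.order_count += 1
        return order

    def users_with_orders(self) -> List[str]:
        # Counterpart of a query such as:
        #   SELECT * FROM User WHERE order_count > 0
        return [u.name for u in self.users.values() if u.order_count > 0]


store = Store()
for name in ("alice", "bob", "carol"):
    store.users[name] = User(name)

store.create_order(1, "alice")
store.create_order(2, "alice")
store.create_order(3, "carol")

distinct_users = store.users_with_orders()
```

Because the answer is read from the User kind directly, no DISTINCT or grouping over Order keys is needed.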