Search code examples
pythongoogle-app-enginegoogle-cloud-datastoreapp-engine-ndbwebapp2

How to compare multiple dates on an NDB query?


I need to fetch objects on an NDB queries that match a given start and end date, but I'm not able to do this traditionally simple query because NDB is complaining:

from google.appengine.ext import ndb
from datetime import datetime
from server.page.models import Post

now = datetime.now()
query = Post.query(
    Post.status == Post.STATUS_ACTIVE,
    Post.date_published_start <= now,
    Post.date_published_end >= now,
)
count = query.count()

Error:

BadRequestError: Only one inequality filter per query is supported.
Encountered both date_published_start and date_published_end

Is there any workarounds for this?


Solution

  • Dynamically obtaining a single result list that can be directly used for pagination without any further processing is not possible due to the limitation of a single inequality filter per query limitation. Related GAE 4301 issue.

    As Jeff mentioned, filtering by one inequality (ideally the most restrictive one) followed by further dynamic processing of the results is always an option, inefficient as you noted, but unavoidable if you need total flexibility of the search.

    You could improve the performance by using a projection query - reducing the amount of data transfered from the datastore to just the relevant properties.

    You could also try to perform 2 keys-only queries, one for each inequality, then compute the intersection of the results - this could give you the pagination counts and list of entities (as keys) faster. Finally you'd get the entities for the current page by direct key lookups for the keys in the page list, ideally batched (using ndb.get_multi()).

    Depending on the intended use you might have other alternatives in some cases (additional work required, of course).

    You could restrict the scope of the queries. Instead of querying all Post entities since the begining of time maybe just results in a certain year or month would suffice in certain cases. Then you could add the year and/or month Post properties which you can include as equality filters in your queries, potentially reducing the number of results to process dynamically from thousands to, say, hundreds or less.

    You could also avoid the queries altogether for typical, often-use cases. For example if the intended use is to generate a few kinds of monthly reports you could have some Report entities containing lists of Post keys for each such report kind/month which you could update whenever a Post entity's relevant properties change. Instead of querying Posts entities for a report you'd instead just use the already available lists from the respective Report entity. You could also store/cache the actual report upon generation, for direct re-use (instead of re-generating it at every access).