Tags: java, google-app-engine, cursor, google-cloud-datastore, objectify

Are GAE Datastore cursors permanent and durable?


Is it correct to say that a com.google.appengine.api.datastore.Cursor simply stores an index position into a GAE Datastore index?

Are cursors durable? That is, can I store a cursor permanently and reuse it again and again, knowing for sure that if it was pointing to the 5000th position in the index, that's where it will point forever?

What if the index shrinks to less than 5000 entries? Will using this cursor cause an error or simply return nothing?

For larger indexes (say, 100,000 or more entries), can I first acquire a cursor at every multiple of 5000, store these cursors, and then use the set to do some work in a Map/Reduce manner?

I am actually using Objectify and not the DS directly, but AFAIK this will not affect the properties of Cursors vis-a-vis Indexes.


Solution

  • Cursors only make sense in the context of the original query. They are not index positions or offsets. From the "Cursors and data updates" section of the documentation:

    The cursor's position is defined as the location in the result list after the last result returned. A cursor is not a relative position in the list (it's not an offset); it's a marker to which Cloud Datastore can jump when starting an index scan for results. If the results for a query change between uses of a cursor, the query notices only changes that occur in results after the cursor. If a new result appears before the cursor's position for the query, it will not be returned when the results after the cursor are fetched. Similarly, if an entity is no longer a result for a query but had appeared before the cursor, the results that appear after the cursor do not change. If the last result returned is removed from the result set, the cursor still knows how to locate the next result.
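The "marker, not offset" behaviour quoted above can be illustrated with a toy model. This is only a sketch in plain Java, not the Datastore API: a `TreeSet` stands in for the index, and the hypothetical `CursorModel.fetch` treats the cursor as "the last key returned" and resumes strictly after it. Real cursors are opaque encoded positions, so treat this as an analogy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Toy model only: a sorted set of strings stands in for a Datastore index.
class CursorModel {
    // One page of results plus the marker to resume from.
    static final class Page {
        final List<String> results;
        final String cursor; // last key returned, or the incoming cursor if the page is empty
        Page(List<String> results, String cursor) {
            this.results = results;
            this.cursor = cursor;
        }
    }

    // Return up to `limit` keys strictly after `cursor` (null means "from the start").
    static Page fetch(TreeSet<String> index, String cursor, int limit) {
        Iterable<String> scan = (cursor == null) ? index : index.tailSet(cursor, false);
        List<String> out = new ArrayList<>();
        String last = cursor;
        for (String key : scan) {
            if (out.size() == limit) break;
            out.add(key);
            last = key;
        }
        return new Page(out, last);
    }

    public static void main(String[] args) {
        TreeSet<String> index = new TreeSet<>(Arrays.asList("b", "d", "f", "h"));
        Page first = fetch(index, null, 2);           // b, d; the marker sits after "d"
        index.add("a");                               // a new result appears before the marker
        index.remove("d");                            // the last returned result is removed
        Page second = fetch(index, first.cursor, 2);
        System.out.println(second.results);           // [f, h]
        index.clear();                                // the index shrinks below the marker
        System.out.println(fetch(index, first.cursor, 2).results); // []
    }
}
```

In this model, a marker that ends up past the end of the index simply yields an empty page. An offset-based position would instead have shifted when "a" was added or "d" was removed.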

    Also, from the "Limitations of cursors" section:

    Cursors are subject to the following limitations:

    • A cursor can be used only by the same application that performed the original query, and only to continue the same query. To use the cursor in a subsequent retrieval operation, you must reconstitute the original query exactly, including the same entity kind, ancestor filter, property filters, and sort orders. It is not possible to retrieve results using a cursor without setting up the same query from which it was originally generated.
    • Because the NOT_EQUAL and IN operators are implemented with multiple queries, queries that use them do not support cursors, nor do composite queries constructed with the CompositeFilterOperator.or method.
    • Cursors don't always work as expected with a query that uses an inequality filter or a sort order on a property with multiple values. The de-duplication logic for such multiple-valued properties does not persist between retrievals, possibly causing the same result to be returned more than once.
    • New App Engine releases might change internal implementation details, invalidating cursors that depend on them. If an application attempts to use a cursor that is no longer valid, Cloud Datastore raises an IllegalArgumentException (low-level API), JDOFatalUserException (JDO), or PersistenceException (JPA).
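Because a stored cursor can become invalid, code that persists cursors probably wants a fallback. The following is a minimal sketch in plain Java, not the Datastore API: `fetchPage` is a hypothetical stand-in for code that rebuilds the original query exactly and runs it from a given cursor (null meaning "from the start"). It assumes the low-level behaviour described above, where an invalid cursor surfaces as an `IllegalArgumentException`. Restarting from the beginning is just one possible policy.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

class CursorFallback {
    // Try to resume from a stored cursor; if it is rejected, restart the query from the beginning.
    // `fetchPage` is hypothetical: it must rebuild the same query (kind, filters, sort orders) every time.
    static <T> List<T> resumeOrRestart(Function<String, List<T>> fetchPage, String storedCursor) {
        try {
            return fetchPage.apply(storedCursor);
        } catch (IllegalArgumentException invalidCursor) {
            // Assumption: the backend signals an unusable cursor this way (low-level API).
            return fetchPage.apply(null);
        }
    }

    public static void main(String[] args) {
        // Fake backend for illustration: the cursor "stale" is treated as invalidated.
        Function<String, List<String>> fake = cursor -> {
            if ("stale".equals(cursor)) {
                throw new IllegalArgumentException("invalid cursor");
            }
            return cursor == null ? Arrays.asList("first-page") : Arrays.asList("page-after-" + cursor);
        };
        System.out.println(resumeOrRestart(fake, "c1"));    // [page-after-c1]
        System.out.println(resumeOrRestart(fake, "stale")); // [first-page]
    }
}
```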

    If your data doesn't change, you're probably OK using cursors in a map/reduce manner, including pre-acquiring them, as long as each worker reconstructs the original query exactly before applying its cursor.
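The pre-acquire-then-shard idea can be sketched with a toy model. This is plain Java rather than the Datastore API: a `TreeSet` stands in for the index, markers stand in for cursors, and `acquireMarkers` and `processShard` are hypothetical names. One pass records a marker after every `shardSize` entries. Each shard is then processed independently from its marker, and the partial results are summed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Toy model only: markers play the role of stored cursors over a static index.
class ShardedScan {
    // Walk the index once, recording a marker after every `shardSize` entries.
    // The first marker is null, meaning "start of the index".
    static List<String> acquireMarkers(TreeSet<String> index, int shardSize) {
        List<String> markers = new ArrayList<>();
        markers.add(null);
        int seen = 0;
        String last = null;
        for (String key : index) {
            if (seen > 0 && seen % shardSize == 0) {
                markers.add(last);
            }
            last = key;
            seen++;
        }
        return markers;
    }

    // "Map" step: process at most `shardSize` entries strictly after the marker.
    // The work here is just summing key lengths.
    static int processShard(TreeSet<String> index, String marker, int shardSize) {
        Iterable<String> scan = (marker == null) ? index : index.tailSet(marker, false);
        int total = 0;
        int processed = 0;
        for (String key : scan) {
            if (processed == shardSize) break;
            total += key.length();
            processed++;
        }
        return total;
    }

    // "Reduce" step: combine per-shard results. Shards only read the index, so they can run in parallel.
    static int runAll(TreeSet<String> index, List<String> markers, int shardSize) {
        return markers.parallelStream()
                .mapToInt(m -> processShard(index, m, shardSize))
                .sum();
    }

    public static void main(String[] args) {
        TreeSet<String> index = new TreeSet<>(Arrays.asList(
                "apple", "banana", "cherry", "date", "elderberry", "fig", "grape"));
        List<String> markers = acquireMarkers(index, 3);
        System.out.println(markers);                    // [null, cherry, fig]
        System.out.println(runAll(index, markers, 3));  // 39
    }
}
```

The model also shows why static data matters. If entries are added or removed after the markers are taken, shards no longer hold exactly `shardSize` entries, so a fixed per-shard limit can skip entries or process them twice.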