Tags: indexing, plone, zodb

Plone - ZODB catalog query sort_on multiple indexes?


I have a ZODB catalog query with a start and end date. I want to sort the result on end_date first and then start_date second.

Sorting on either end_date or start_date works fine.

I tried passing a tuple (start_date, end_date) as sort_on, but with no luck.

Is there a way to achieve this, or does one have to apply some custom logic afterwards?


Solution

  • The general answer is to sort your entire result set of catalog brains after the query (post hoc), using zope.sequencesort (available on PyPI, but already shipped with Plone) or similar.

    The more complex answer is a rabbit-hole of optimizations that you should only go down if you know you need to and know what you are doing:

    1. Make sure that when you sort the brains, your user gets a sticky session to the same instance, at least for cache affinity, so that the same catalog indexes and brains (metadata) are already loaded;
    2. You might want to cache across requests (thread-global) a unique session id and the sequence of catalog RID (integer) values for the entire sorted result of that request, should you expect the user to come back and need subsequent batches. Of course, the RIDs need to be reconstituted into ZCatalog's lazy sequences of brains, and this requires some know-how (or reading the source).
    3. Finally, for large result sets (many thousands), I would suggest that it is reasonable to make application-specific compromises that approximate a correct order by post-hoc sorting of the current batch through to the end of the n batches after it, where n is inversely proportional to len(site.portal_catalog.uniqueValuesFor(indexnamehere)). For a large set of results, an approximated secondary sort is mostly correct when the values vary a lot and often wrong when they vary little (many items sharing the same secondary value, such that their count is much larger than the batch size, can make this frustrating).

    Do not optimize as such unless you are dealing with particularly large result sets.

    It should go without saying: if you do optimize, you need to verify that you are actually getting a superior result (profile and benchmark). If you cannot justify investing the time to do this, you cannot justify optimizing.
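
As a minimal sketch of the general answer (post-hoc sorting of the whole result set), plain Python's sorted() with a tuple key is often enough. This is an illustration under assumptions, not the only way to do it: the Brain namedtuple below is a hypothetical stand-in for real catalog brains, and end_date and start_date are assumed to be available as metadata columns on the brains, as the names in the question suggest. In a real site the list would come from a catalog query rather than being built by hand.

```python
from collections import namedtuple
from datetime import date

# Hypothetical stand-in for a catalog brain. Real brains come from a catalog
# query and expose metadata columns as attributes; the column names
# start_date and end_date are assumed from the question.
Brain = namedtuple("Brain", ["getId", "start_date", "end_date"])


def sort_brains(brains):
    """Return a new list ordered by end_date first, then start_date."""
    # Tuples compare element by element, so end_date is the primary key
    # and start_date only breaks ties between equal end dates.
    return sorted(brains, key=lambda b: (b.end_date, b.start_date))


brains = [
    Brain("a", date(2024, 3, 1), date(2024, 3, 10)),
    Brain("b", date(2024, 1, 5), date(2024, 3, 10)),
    Brain("c", date(2024, 2, 1), date(2024, 2, 15)),
]

sorted_ids = [b.getId for b in sort_brains(brains)]
print(sorted_ids)  # ['c', 'b', 'a']
```

Note that sorted() builds a full list, so every brain in the lazy result sequence is loaded; that is the cost the optimizations above try to avoid. If some items may lack one of the dates, the key function would need to map missing values to something comparable before sorting.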