Search code examples
randomsphinx

Is there a way to randomize search results (record ids) with Sphinx?


I have a complex SphinxQL query which, at the end, orders results by a specific field, Preferred, so that all records with that indexed value of Preferred=1 come before all records w Preferred=0. I also order by weight() so basically I end up with:

Select * from idx_X where MATCH('various parameters') ORDER by Preferred DESC,Weight() Desc

The problem is that, though Preferred records come first I end up with records sorted by ID which puts results from one field, Vendor, in blocks so for instance I get:

Beta Shipping Beta Shipping Beta Shipping Acme Widgets Acme Widgets Acme Widgets Acme Widgets Acme Widgets

Which doesn't serve my purposes in this case well (often one 'Vendor' will have 1000 results)

So I'm looking to essentially do:

ORDER BY Preferred DESC,weight() DESC,ID RANDOM

So that after getting to Preferred Vendors whose weight is (e.g.) 100, I will get random Vendors vs blocks of them.

Update: Though I did find what appears to be a possible answer in another Stackoveflow Question

The issue is it seems to require the SPH_SORT_EXTENDED and I am forced to use SPH_RANK_PROXIMITY (ranker=proximity) and I am unclear if I can combine ranking and sorting.

Update 2: If I remove my existing two-level Order and just do Order by Rand() it indeed returns random IDs. However I cannot add Rand() after Order by Preferred DESC,Weight() DESC or I get the following error:

1064 - sphinxql: syntax error, unexpected '(', expecting $end near '()


Solution

  • Sadly yes, RAND() only works as a single sort order expression, but it DOES work as a select function....

    Select *, RAND() AS r from idx_X where MATCH('various parameters') 
     ORDER by Preferred DESC,Weight() Desc, r DESC
    

    Or if want a more consistent ordering, but still mixed, can for example use CRC32() function on a string atribute

    Select *, CRC32(title) AS r from idx_X where MATCH('various parameters') 
     ORDER by Preferred DESC,Weight() Desc, r DESC
    

    Can also just limit results to a few per vendor (vendor will need to be an attribute)

    Select * from idx_X where MATCH('various parameters') 
     GROUP 3 BY vendor_id ORDER by Preferred DESC,Weight() Desc
    

    Group by N is a little known by very useful sphinx feature.