syntax, pagination, lucene.net

Lucene.Net paging using the query?


I'm using Lucene.Net to build an index and search it. I'm actually using the API indirectly, through the Examine project on CodePlex. Everything currently works and the paging logic is in place, but that logic pages the results only after the search has completed. I don't like this, because the search may return thousands of records, and only then does my code take the 10-20 records it needs and discard the rest, which is a major waste of resources. Even if each SearchResult item is a tiny 3 KB, the memory needed to execute these searches will grow over time and become a huge memory hog. My shared host guarantees only 1 GB of dedicated memory, so this is a big concern for my website.

So the question is: how do I limit the results in a paged manner using the Lucene query language alone? I looked at the Apache Lucene project, which Lucene.Net is ported from, and I don't see any syntax that lets me do what I'm looking for. Basically, I want the equivalent of what SQL Server offers for limiting rows at the query-language level.

E.g. (this is how we would do paging in SQL; it returns only 20 records, not every record that matches the WHERE clause):

SELECT * FROM (SELECT ROW_NUMBER() OVER (ORDER BY OrderDate) AS RowNum, OrderID, OrderDate FROM SalesOrders WHERE OrderCustomerName LIKE 'Davis%') O WHERE RowNum BETWEEN 1 AND 20


Solution

  • I don't think there is a major waste of resources, since a search is (to put it simply) nothing more than calculating a bit vector and scores. What is costly is reading the docs from the index. Search results (except with the deprecated Hits class) don't read the docs; they just return the doc IDs, so there isn't much overhead in skipping the first N results.

    The exception is when you want to sort the results by some field. Then all docs in the search result list must be read from the index so that they can be returned in the correct order.
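
To make the point above concrete, here is a minimal, dependency-free sketch of the idea in Python. It is not the Lucene.Net or Examine API: `ToyIndex`, `ScoredHit`, and `get_page` are hypothetical names invented for illustration. The search step touches only postings and returns doc IDs with scores; stored documents are loaded only for the page that is actually displayed. In Lucene.Net the analogous step is, as far as I know, asking the searcher for the top N hits and loading documents only for the slice you need, but check the exact signatures for your version.

```python
import heapq
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredHit:
    """Hypothetical stand-in for a search hit: a doc ID and a score, no stored fields."""
    doc_id: int
    score: float


class ToyIndex:
    """Hypothetical in-memory index that counts how many stored docs get loaded."""

    def __init__(self, docs):
        self._stored = dict(docs)   # doc_id -> stored text (the "expensive" part)
        self._postings = {}         # term -> {doc_id: term frequency}
        for doc_id, text in self._stored.items():
            for word in text.lower().split():
                freqs = self._postings.setdefault(word, {})
                freqs[doc_id] = freqs.get(doc_id, 0) + 1
        self.loads = 0              # number of stored documents read so far

    def search(self, term, top_n):
        """Return the best top_n hits using postings only; no stored doc is read."""
        matches = self._postings.get(term.lower(), {})
        hits = (ScoredHit(doc_id, tf) for doc_id, tf in matches.items())
        # Highest score first; ties broken by doc ID so paging is stable.
        return heapq.nsmallest(top_n, hits, key=lambda h: (-h.score, h.doc_id))

    def load(self, doc_id):
        """Read one stored document; this is the costly step being avoided."""
        self.loads += 1
        return self._stored[doc_id]


def get_page(index, term, page, page_size):
    """Return the stored docs for a 1-based page, loading only that page's docs."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    start = (page - 1) * page_size
    # Ask only for as many hits as are needed to reach the end of this page.
    hits = index.search(term, top_n=start + page_size)
    # Skipping the first `start` hits is cheap: they are just IDs and scores.
    return [index.load(hit.doc_id) for hit in hits[start:]]
```

Requesting page 2 with a page size of 10 collects at most 20 lightweight hits and reads exactly 10 stored documents, however many documents match the term.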