lucene scoring

Emit every document in the database with Lucene


I have an index where a standard search needs to return every document, still ranked by relevance, including documents that don't match the query.

My first idea is to add a field that always matches, but that might distort the relevance score.


Solution

  • Use a BooleanQuery to combine your original query with a MatchAllDocsQuery. You can mitigate the effect this has on scoring by setting the boost on the MatchAllDocsQuery to zero before you combine it with your main query. This way you don't have to add an otherwise bogus field to the index.

    For example:

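    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.queryParser.QueryParser;
    import org.apache.lucene.search.BooleanClause;
    import org.apache.lucene.search.BooleanQuery;
    import org.apache.lucene.search.MatchAllDocsQuery;
    import org.apache.lucene.search.Query;
    import org.apache.lucene.util.Version;

    // Parse the user's query (parse() throws the checked ParseException).
    // In Lucene 3.5 StandardAnalyzer needs a Version argument as well.
    QueryParser qp = new QueryParser(Version.LUCENE_35, "text", new StandardAnalyzer(Version.LUCENE_35));
    Query standardQuery = qp.parse("User query may go here");

    // Make a query that matches everything, but has no boost.
    MatchAllDocsQuery matchAllDocsQuery = new MatchAllDocsQuery();
    matchAllDocsQuery.setBoost(0f);

    // Combine the queries; SHOULD makes each clause optional.
    BooleanQuery boolQuery = new BooleanQuery();
    boolQuery.add(standardQuery, BooleanClause.Occur.SHOULD);
    boolQuery.add(matchAllDocsQuery, BooleanClause.Occur.SHOULD);

    // Now just pass it to the searcher, for example (assuming you already
    // have an IndexSearcher named searcher and want the top n results):
    // TopDocs hits = searcher.search(boolQuery, n);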

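    This should give you the hits from standardQuery first, ranked by relevance as usual, followed by the rest of the documents in the index, which get no score contribution from the zero-boost clause.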
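To see why the ordering comes out that way, here is a toy model in plain Java. It is not Lucene's scoring formula and uses no Lucene classes: `MatchAllRankingSketch`, `Hit`, and `rank` are hypothetical names made up for illustration. It only assumes that a document's score is the sum of the scores of the optional clauses it matches, and that the match-all clause scores 1 multiplied by its boost.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of two optional clauses; NOT Lucene's real scoring. */
public class MatchAllRankingSketch {

    /** Hypothetical stand-in for a scored search hit. */
    public static final class Hit {
        public final String docId;
        public final float score;

        Hit(String docId, float score) {
            this.docId = docId;
            this.score = score;
        }
    }

    /**
     * Returns every document ranked by descending score.
     * mainScores holds an entry only for documents the main query matched.
     */
    public static List<Hit> rank(List<String> allDocs,
                                 Map<String, Float> mainScores,
                                 float matchAllBoost) {
        List<Hit> hits = new ArrayList<>();
        for (String doc : allDocs) {
            // Assumption: non-matching documents get 0 from the main clause.
            float main = mainScores.getOrDefault(doc, 0f);
            // Assumption: the match-all clause scores 1 times its boost.
            float matchAll = 1f * matchAllBoost;
            hits.add(new Hit(doc, main + matchAll));
        }
        // The sort is stable, so tied documents keep their index order.
        hits.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
        return hits;
    }

    public static void main(String[] args) {
        List<String> allDocs = Arrays.asList("d0", "d1", "d2", "d3");
        Map<String, Float> mainScores = new HashMap<>();
        mainScores.put("d1", 0.4f);
        mainScores.put("d3", 0.9f);

        // With a boost of 0 the order is d3, d1, d0, d2.
        for (Hit hit : rank(allDocs, mainScores, 0f)) {
            System.out.println(hit.docId);
        }
    }
}
```

In this model the zero boost adds nothing to any document, so the matched documents keep their original scores and every other document ties at zero behind them. How tied documents are ordered in a real index depends on the collector or sort you use, so don't rely on the order of the non-matching tail.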