
Is it possible to use multiple words in a filter query in SOLRJ / SOLR?


I am using SOLRJ (with SOLR 7) and my index features some fields for the document contents named content_eng, content_ita, ... It also features a field with the full path to the document (processed by a StandardTokenizer and a WordDelimiterGraphFilter).

The user is able to search in the content_xyz fields thanks to these lines:

final SolrQuery query = new SolrQuery();
query.setQuery(searchedText);
// searchFields is a generated String which looks like "content_eng content_ita" (field names separated by spaces)
query.set("qf", searchFields);

Now the user needs to be able to specify some words contained in the path (namely some subdirectories). So I added a filter query:

query.addFilterQuery(
                "full_path_split:" + searchedPath);

If searchedPath contains only a single word from the document path, the document is correctly returned. However, if searchedPath contains several words from the path, the document is not returned. To sum up, the fq only works if searchedPath contains a single word.

For example doc1 is in /home/user/dir1/doc1.txt

If I search for all (* in searchedText) documents that are in user dir (fq=full_path_split%3Adir) doc1.txt is returned.

If I do the same search but for documents that are in user and dir1 (fq=full_path_split%3Auser+dir1), doc1.txt is not returned. I think it is because the fq is parsed as "+full_path_split:user +_text_:dir1", as debug=query shows. I don't know where _text_ comes from; it may be a default field.

So is it possible to use a filter query with several words to fulfill my needs?

Any help appreciated,


Solution

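  • Your suspicion is correct - the _text_:dir1 part comes from you not providing a field name for the second word. The full_path_split: prefix only applies to the term that directly follows it, so the default field name is used for dir1 instead.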

    You can work around this by using the more general edismax (or the older dismax) parser as you're doing in your main query with qf:

    fq={!type=edismax qf='full_path_split'}user dir1
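On the SOLRJ side, the filter query string above could be built with a small helper like the one below. This is only a sketch: the class and method names (PathFilterQuery, edismax, grouped) are made up for illustration, and user input is not escaped. The second method shows an alternative that is not part of the answer above: with the default lucene parser, putting the words in parentheses after the field name applies the field to every word. Whether all words are then required depends on your q.op / mm settings.

```java
// Sketch only: builds filter-query strings for Solr.
// The class and method names are hypothetical, not part of SolrJ.
final class PathFilterQuery {

    private PathFilterQuery() {}

    // edismax form from the answer: every word is searched in the given field.
    static String edismax(String field, String words) {
        return "{!type=edismax qf='" + field + "'}" + words.trim();
    }

    // Alternative (assumption, not from the answer): for the default lucene
    // parser, group the words so the field prefix applies to all of them.
    static String grouped(String field, String words) {
        return field + ":(" + words.trim() + ")";
    }
}
```

It would then replace the original call, for example: query.addFilterQuery(PathFilterQuery.edismax("full_path_split", searchedPath));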