Proximity Search using phrases in Solr


I use Solr's proximity search quite often to find words within a specified distance of each other, like so:

"Government Spending" ~2

Is there a way to perform a proximity search between a phrase and a word, or between two phrases? If so, what is the syntax?


Solution

  • This appears to be "somewhat" doable. Consider this text:

    This is more about traffic between Solr servers themselves 

    Both of these queries match it:
    

    "more traffic between solr" ~2

    "more about between solr" ~2

    It works even if you change the word order:

    "more about solr between" ~2

    But if the words are too far apart, it stops matching:

    "more about servers themselves" ~2

    If that doesn't work for you, I think it would not be too hard to write a custom request handler that does this. You might need to define a new syntax, perhaps something like ("phrase one" "phrase two") ~2. My guess is that if you are shingling, and you build a Lucene query in which "phrase one" is a single token and "phrase two" is another, with a given proximity between them, it will work. Of course, you will need to build the query through the Lucene Java API yourself; you can't just hand the query string over (see http://lucene.apache.org/java/2_2_0/api/index.html).
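
To see why the example queries behave the way they do, here is a rough Python model of sloppy phrase matching, together with a sketch of what the hypothetical ("phrase one" "phrase two") ~2 syntax could mean. This is not Lucene's code: `tokenize`, `sloppy_match`, `find_phrase`, and `phrases_within` are names made up for illustration, the whitespace tokenizer is only a stand-in for a real Solr analyzer, and Lucene itself treats edge cases such as repeated terms more carefully.

```python
from itertools import product


def tokenize(text):
    """Lowercase and split on whitespace (assumed stand-in for a Solr analyzer)."""
    return text.lower().split()


def sloppy_match(doc_tokens, query_terms, slop):
    """Simplified sloppy phrase check (a model, not Lucene's implementation).

    Each term's document position is shifted back by the term's offset in the
    query. An exact phrase yields identical shifted values, so the spread
    (max - min) measures how far the terms had to move.
    """
    candidates = []
    for term in query_terms:
        positions = [i for i, tok in enumerate(doc_tokens) if tok == term]
        if not positions:
            return False  # a query term is missing from the document
        candidates.append(positions)

    for combo in product(*candidates):
        if len(set(combo)) != len(combo):
            continue  # two query terms may not share one document position
        shifted = [pos - offset for offset, pos in enumerate(combo)]
        if max(shifted) - min(shifted) <= slop:
            return True
    return False


def find_phrase(doc_tokens, phrase_terms):
    """Return (start, end) pairs for exact phrase occurrences; end is exclusive."""
    n = len(phrase_terms)
    return [
        (i, i + n)
        for i in range(len(doc_tokens) - n + 1)
        if doc_tokens[i:i + n] == phrase_terms
    ]


def phrases_within(doc_tokens, phrase_a, phrase_b, slop):
    """Hypothetical ("phrase a" "phrase b") ~slop semantics.

    Both phrases must occur exactly, in either order, without overlapping,
    and with at most `slop` tokens between them.
    """
    for a_start, a_end in find_phrase(doc_tokens, phrase_a):
        for b_start, b_end in find_phrase(doc_tokens, phrase_b):
            gap = max(b_start - a_end, a_start - b_end)
            if 0 <= gap <= slop:
                return True
    return False


if __name__ == "__main__":
    doc = tokenize("This is more about traffic between Solr servers themselves")
    for query in (
        "more traffic between solr",
        "more about between solr",
        "more about solr between",
        "more about servers themselves",
    ):
        print(query, "->", sloppy_match(doc, tokenize(query), 2))
    print(phrases_within(doc, ["more", "about"], ["between", "solr"], 2))
```

Under this model the first three queries match with a slop of 2 and the last one does not, which agrees with the results above. For the real thing, Lucene's span queries (for example, one zero-slop, in-order SpanNearQuery per phrase, wrapped in an outer SpanNearQuery with the desired slop) might be a reasonable starting point, but treat that as a suggestion to try rather than a tested recipe.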