
Lucene Proximity Search with multiple words


I am trying to build a query to search a Lucene index of names with name variants. The index was built with Lucene.NET version 2.9.2.

The user enters, for example, "Margaret White". Without the name variant option, my query becomes "Margaret White"~1 and it works.

Now, I can look up name variants for both the first name and the surname to produce an extended list. For example (I only include a few here, since the list can sometimes run to 100 or more), we can have

Margaret / Margrett White / Whyte

The query "margrett white"~1 OR "margaret white"~1 OR "margrett whyte"~1 OR "margaret whyte"~1

gives me the correct result, but with a possible 100 x 100 variant combinations the query string would be cumbersome, to say the least.
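To make the growth concrete, the expansion can be sketched as below. This is only an illustration in Python, not Lucene.NET code; `expand_query` is a hypothetical helper that builds the query string and does not touch any Lucene API.

```python
from itertools import product

def expand_query(first_names, surnames, slop=1):
    """Build one proximity phrase per (first name, surname) pair, joined with OR."""
    clauses = [
        '"{} {}"~{}'.format(first.lower(), last.lower(), slop)
        for first, last in product(first_names, surnames)
    ]
    return " OR ".join(clauses)

# Two variants per name already give four phrase clauses.
query = expand_query(["Margrett", "Margaret"], ["White", "Whyte"])
print(query)

# With 100 variants on each side the query holds 100 * 100 = 10000 clauses.
big = expand_query(
    ["first{}".format(i) for i in range(100)],
    ["last{}".format(i) for i in range(100)],
)
print(big.count(" OR ") + 1)  # 10000
```

The clause count is the product of the two variant lists, so it grows far faster than either list does.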

I have tried various ways to achieve a more compact query but nothing seems to work.

Can anyone give me any pointers or an alternative approach? I have control over the index creation process and wonder whether there is something I can do at that stage.

Thanks for looking. Roger


Solution

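  • Apply the synonym mapping during indexing instead of at query time. Map every variant ("white", "whyte", ...) to a single canonical word, say "white", and do the same for the variants of "Margaret".

    Then your query will just be "margaret white"~1, provided the user's input is passed through the same mapping before it is searched.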
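The idea can be sketched independently of Lucene. The Python below is a minimal illustration under stated assumptions: the `CANONICAL` table and the `normalize` and `build_query` helpers are hypothetical names, not part of Lucene or Lucene.NET. In a real index the same mapping would presumably sit in the analysis step that is applied to both the indexed names and the query text.

```python
# Hypothetical variant table: every known spelling maps to one canonical token.
CANONICAL = {
    "margaret": "margaret",
    "margrett": "margaret",
    "white": "white",
    "whyte": "white",
}

def normalize(text):
    """Lowercase, split on whitespace, and replace each token with its canonical form."""
    return [CANONICAL.get(token, token) for token in text.lower().split()]

def build_query(user_input, slop=1):
    """Build a single proximity phrase from the normalized user input."""
    return '"{}"~{}'.format(" ".join(normalize(user_input)), slop)

# Index time: the stored tokens are the canonical forms.
indexed_tokens = normalize("Margrett Whyte")

# Query time: any variant spelling collapses to the same single phrase query.
print(indexed_tokens)
print(build_query("Margaret White"))
print(build_query("Margrett Whyte"))
```

Because both sides are reduced to the same canonical tokens, one phrase query stands in for the whole variant product. The trade-off is that the index no longer distinguishes the original spellings unless they are also kept in a separate field.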