java solr nlp opennlp

Semi-natural language search using Apache Solr


I did some analysis on Apache Solr, and it's pretty good at searching data from various sources. The problem I'm facing is how to standardize my search grammar and translate the search text into a Solr query.

I have three types of files/database tables to search, namely Customer, Industry and Unit. The first keyword in the search box should be one of these three. After that, the user can specify a fixed set of criteria:

Metrics: 0 or more (e.g. exposure, income, revenue, loan_amt)
Dimensions: 0 or more (e.g. geography, region)

Examples:

customer - Returns all customer data from the customer core
customer income from Asia - Returns income details for all customers who belong to Asia
customer income revenue from Asia - Returns income and revenue details for all customers who belong to Asia

How can I translate the natural-language search text above into a Solr query? Can I fix the grammar of the text in Solr, so that the first keyword must be customer/industry/unit, the second part is one or more region/geography values, and the metric values come after that?

I am not looking for a Google-like search, but for a limited search where the user knows what to search for.


Solution

  • This doesn't seem to be a Solr question, strictly speaking. As a first step, you might want to define a context-free grammar (CFG, a type-2 grammar in the Chomsky hierarchy) with specific production rules for your input. This gives you solid syntax rules to work from. You can then write a parser for the search input and map the resulting parse tree to a keyword search in Solr.
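
To make the grammar-then-parser idea concrete, here is a minimal sketch in plain Java. It is not the answerer's code, and everything specific in it is an assumption: the grammar follows the token order used in the question's examples (entity, then metrics, then "from" plus regions), the metric vocabulary is taken from the question's list, `region` is a hypothetical Solr field name, and the class name `SearchQueryParser` is made up. It only builds the standard Solr request parameters `q`, `fl` and `fq` as strings and does not talk to Solr.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Assumed grammar (one possible reading of the examples in the question):
//   query  := entity metric* ( "from" region+ )?
//   entity := "customer" | "industry" | "unit"
//   metric := "exposure" | "income" | "revenue" | "loan_amt"
//   region := any word, e.g. "Asia"
class SearchQueryParser {

    private static final Set<String> ENTITIES =
            new HashSet<>(Arrays.asList("customer", "industry", "unit"));
    private static final Set<String> METRICS =
            new HashSet<>(Arrays.asList("exposure", "income", "revenue", "loan_amt"));

    // Hypothetical Solr field holding the dimension value.
    private static final String REGION_FIELD = "region";

    /** Parses the search text and returns the target core plus Solr parameters. */
    static Map<String, String> parse(String text) {
        String[] tokens = text.trim().split("\\s+");
        int pos = 0;

        // entity: mandatory first keyword; it selects the core to query.
        String entity = tokens[pos].toLowerCase(Locale.ROOT);
        if (!ENTITIES.contains(entity)) {
            throw new IllegalArgumentException(
                    "Expected customer, industry or unit, got: '" + tokens[pos] + "'");
        }
        pos++;

        // metric*: zero or more metrics; they become the field list (fl).
        List<String> metrics = new ArrayList<>();
        while (pos < tokens.length
                && METRICS.contains(tokens[pos].toLowerCase(Locale.ROOT))) {
            metrics.add(tokens[pos].toLowerCase(Locale.ROOT));
            pos++;
        }

        // ("from" region+)?: optional dimension filter; it becomes a filter query (fq).
        List<String> regions = new ArrayList<>();
        if (pos < tokens.length && tokens[pos].equalsIgnoreCase("from")) {
            pos++;
            while (pos < tokens.length) {
                // Accept only plain words so no query-syntax characters leak into fq.
                if (!tokens[pos].matches("\\w+")) {
                    throw new IllegalArgumentException("Invalid region: " + tokens[pos]);
                }
                regions.add("\"" + tokens[pos] + "\"");
                pos++;
            }
            if (regions.isEmpty()) {
                throw new IllegalArgumentException("Expected a region after 'from'");
            }
        }

        // Anything left over does not fit the grammar.
        if (pos < tokens.length) {
            throw new IllegalArgumentException("Unexpected token: " + tokens[pos]);
        }

        Map<String, String> params = new LinkedHashMap<>();
        params.put("core", entity); // not a Solr parameter: the core name goes into the URL
        params.put("q", "*:*");     // match all documents; narrowing happens in fq
        if (!metrics.isEmpty()) {
            params.put("fl", String.join(",", metrics));
        }
        if (!regions.isEmpty()) {
            params.put("fq", REGION_FIELD + ":(" + String.join(" OR ", regions) + ")");
        }
        return params;
    }
}
```

With these assumptions, `SearchQueryParser.parse("customer income revenue from Asia")` yields core `customer`, `q=*:*`, `fl=income,revenue` and `fq=region:("Asia")`, while `parse("customer")` yields only the core and `q=*:*`. Input that does not start with one of the three entities is rejected with an exception, which is what gives the "fixed grammar" behaviour asked about. For a grammar that grows beyond a handful of rules, a parser generator would probably be easier to maintain than this hand-written loop.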