java, lucene, uima

Data extraction by tagging using Java


I have a requirement where I have a collection of text files (unstructured data). Based on the user's input (a tag), I need to search for the tag term in all the files. If it is found, I need to return the paragraph in which the search term occurs.

For example, a spec.txt file has the following content:

The ABX earphones with Bluetooth support have been rolled into the Indian market for a price of Rs 5490. They’re available in two color choices of black and red, and come with a rechargeable battery which can be juiced up via the supplied micro-USB cable.

The ABX is said to be capable of rendering up to 10.5 hours of playback once fully charged. It also features an integrated microphone that lets you attend to voice calls. The earphones come with digital noise cancellation technology and a Bluetooth receiver/connector.

In the two paragraphs above, if the user enters the tag "price", it should return "price = Rs 5490", or else it should return the paragraph in which the term "price" was found.
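For a small collection, the second option (returning the matching paragraph) can be sketched in plain Java without a search engine. This is only a minimal illustration, not the approach used in the solution further down: it assumes paragraphs are separated by blank lines and that a case-insensitive substring match is good enough. The class name `ParagraphSearch` and the method `findParagraphs` are made up for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of Lucene, Solr or UIMA.
class ParagraphSearch {

    /** Returns every paragraph of {@code text} that contains {@code term}, ignoring case. */
    static List<String> findParagraphs(String text, String term) {
        List<String> hits = new ArrayList<>();
        String needle = term.toLowerCase(Locale.ROOT);
        // Assumption: paragraphs are separated by one or more blank lines.
        for (String paragraph : text.split("\\R\\s*\\R")) {
            String trimmed = paragraph.trim();
            if (!trimmed.isEmpty()
                    && trimmed.toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(trimmed);
            }
        }
        return hits;
    }
}
```

With the content of spec.txt loaded into a string (for instance via `Files.readString(Path.of("spec.txt"))` on Java 11 or later), `ParagraphSearch.findParagraphs(text, "price")` would return only the first paragraph. Extracting a value such as "Rs 5490" would need additional rules on top of this and is not covered by the sketch.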

I have looked at UIMA and Lucene, but I cannot work out how to do this with them. Can anyone help me?

Thanks in advance


Solution

  • Thank you for your reply. Yes, I found a solution: I am using the Solr highlighter. By adjusting the fragment size of the snippets returned in the Solr response, we can get the paragraph in which the search term occurs.
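The answer does not show the request it used, so the following is only a sketch of what such a highlighting query might look like. It builds a Solr select URL using the standard highlighting parameters `hl`, `hl.fl` and `hl.fragsize`. The base URL, the core name `specs`, the field name `content` and the fragment size of 500 characters are all assumptions for illustration, and the class `SolrHighlightQuery` is hypothetical.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper that only assembles the request URL; it does not call Solr.
class SolrHighlightQuery {

    /**
     * Builds a select URL that searches {@code field} for {@code term} and asks
     * Solr to return highlighted snippets of roughly {@code fragSize} characters.
     */
    static String buildUrl(String baseUrl, String field, String term, int fragSize) {
        // URLEncoder.encode(String, Charset) requires Java 10 or later.
        String query = URLEncoder.encode(field + ":" + term, StandardCharsets.UTF_8);
        String hlField = URLEncoder.encode(field, StandardCharsets.UTF_8);
        return baseUrl + "/select"
                + "?q=" + query
                + "&hl=true"                    // switch highlighting on
                + "&hl.fl=" + hlField           // field to take snippets from
                + "&hl.fragsize=" + fragSize;   // approximate snippet length in characters
    }
}
```

For example, `SolrHighlightQuery.buildUrl("http://localhost:8983/solr/specs", "content", "price", 500)` yields a URL whose response should carry the snippets in its `highlighting` section. The fragment size has to be tuned to the typical paragraph length of the indexed files; a snippet is sized in characters, so it will only approximate paragraph boundaries unless each paragraph is indexed as its own document or field value.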