I intend to make a niche search engine. I am using apache-nutch-1.6 as the crawler and apache-solr-3.6.2 as the searcher. I must say there is very less updated information on web about these technologies.
I followed this tutorial http://wiki.apache.org/nutch/NutchTutorial and have successfully installed apache and solr on my ubuntu system. I was also successful in injecting seed url to webdb and perform the crawl.
Using solr interface at http://localhost:8983/solr/admin
, I can also query the crawled results. But this is the output I receive. .
Am I missing something here, the earlier apache-nutch-0.7 had a war which generated a clear html output like this. . How do I achieve this... Or if anyone could point me to a latest tutorial or guidebook, highly appreciated.
I found below link http://cmusphinx.sourceforge.net/2012/06/building-a-java-application-with-apache-nutch-and-solr/ which answered my query.
I agree after reading the content available on above link, I felt very angry at me. Solr package provides all the required objects to query solr.
Infact, the essential jars are just solr-solrj-3.4.0.jar, commons-httpclient-3.1.jar and slf4j-api-1.6.4.jar.
Anyone can build a java search engine using these objects to query the database and have a fancy UI.
Thanks again.