
Create indexes in Solr on top of HBase


Is there any way to create indexes in Solr for near-real-time (NRT) full-text search over data stored in HBase?

I don't want to store the whole text in my Solr indexes, so I set stored="false".

Note: I am working with large datasets and want near-real-time search. We are talking TB/PB of data.

UPDATED

Cloudera Distribution : 5.4.x is used with Cloudera Search components.

Solr : 4.10.x

HBase : 1.0.x

Indexer Service : Lily HBase Indexer with Cloudera Morphlines

Are there any other NRT indexer services or frameworks that can be used instead of Lily on Cloudera? Just a thought.


Solution

  • Cloudera: please check this article and "HBase-Solr using Cloudera Search", which describe how to achieve this. Those articles include a bird's-eye view diagram of the HBase-Solr integration. Also have a look at the known issues with Cloudera Search.

    Yes, you can consider Morphlines. They can be used for near-real-time applications as well as batch processing applications.

    I don't know much about the Hortonworks platform or how this can be achieved there.
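Since the question sets stored="false", Solr can match on the text but cannot return it, so a search normally returns only document ids and the full text is then read from HBase. Below is a minimal Python sketch of that lookup pattern. It assumes the Solr document id is the HBase row key (check your indexer configuration to confirm this). The collection name, the base URL, and the `hbase_get` callable are hypothetical placeholders, not part of any Cloudera API.

```python
import json
from urllib.parse import urlencode


def build_select_url(solr_base, collection, query, rows=10):
    """Build a Solr /select URL that requests only the id field."""
    # fl=id: with stored="false" the text itself cannot be returned anyway.
    params = urlencode({"q": query, "fl": "id", "rows": rows, "wt": "json"})
    return f"{solr_base.rstrip('/')}/{collection}/select?{params}"


def extract_row_keys(solr_response_text):
    """Return the id of each hit; assumed here to be the HBase row key."""
    body = json.loads(solr_response_text)
    return [doc["id"] for doc in body["response"]["docs"]]


def fetch_full_text(row_keys, hbase_get):
    """Fetch the full text for each row key.

    hbase_get is a caller-supplied function (hypothetical) that takes a
    row key and returns the stored text, e.g. a thin wrapper around
    whatever HBase client you use.
    """
    return {key: hbase_get(key) for key in row_keys}
```

To use it, send an HTTP GET to the URL from `build_select_url`, pass the response body to `extract_row_keys`, and give the resulting keys to `fetch_full_text` together with your own HBase read function. This keeps the Solr index small at the cost of one extra HBase read per hit.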