HBase-indexer & Solr: no data found


I am currently using hbase-indexer to index HBase data in Solr. When I run the following command to check the indexer,

hbase-indexer$ bin/hbase-indexer list-indexers --zookeeper 127.0.0.1:2181

the output is:

myindexer
+ Lifecycle state: ACTIVE 
+ Incremental indexing state: SUBSCRIBE_AND_CONSUME
+ Batch indexing state: INACTIVE
+ SEP subscription ID: Indexer_myindexer
+ SEP subscription timestamp: 2017-01-24T13:15:48.614+09:00
+ Connection type: solr
+ Connection params:
  + solr.zk = localhost:2181/solr
  + solr.collection = tagcollect
+ Indexer config:
    222 bytes, use -dump to see content
+ Indexer component factory: com.ngdata.hbaseindexer.conf.DefaultIndexerComponentFactory
+ Additional batch index CLI arguments:
  (none)
+ Default additional batch index CLI arguments:
  (none)
+ Processes
  + 1 running processes
  + 0 failed processes

I think hbase-indexer is working, as shown above, because the output reports "+ 1 running processes". (Before this, I had already started the hbase-indexer daemon with the command: ~$ bin/hbase-indexer server)

As a test, I inserted data into HBase with the put command and confirmed that the data was inserted.

But a Solr query returns no records (numFound is 0; the full response is shown below).

I would appreciate it if you could share any knowledge or experience related to this. Thank you.

{
"responseHeader":{
"zkConnected":true,
"status":0,
"QTime":7,
"params":{
  "q":"*:*",
  "indent":"on",
  "wt":"json",
  "_":"1485246329559"}},
"response":{"numFound":0,"start":0,"maxScore":0.0,"docs":[]
}}

Solution

  • We encountered the same issue.

    Since you say the server instance is healthy, here are some reasons why indexing may still not work.

    • First, if the write-ahead log (WAL) is disabled (perhaps for write performance), your puts won't create Solr documents.

    The HBase NRT indexer works off the WAL, so writes that skip the WAL never reach the indexer.

    • Second, the Morphline configuration may be incorrect; if it is, no Solr documents will be created.

    However, I'd suggest writing a custom MapReduce program (or a Spark job) that indexes Solr documents by reading the HBase data. This is not real time: data you put into HBase is not reflected in Solr immediately, and the Solr documents are created only after the indexing job runs.
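To make the batch approach a bit more concrete, here is a minimal sketch of the core step such a job would perform: turning rows read from HBase into Solr documents and building a JSON update request. It is plain Python rather than MapReduce or Spark, and several things in it are assumptions: the column family name `cf`, the Solr uniqueKey field `id`, the base URL `http://localhost:8983/solr`, and the sample rows are all hypothetical. Only the collection name `tagcollect` comes from the question. The request is built but not sent.

```python
import json
import urllib.request


def rows_to_solr_docs(rows, family="cf"):
    """Turn HBase-style rows into flat Solr documents.

    `rows` maps a row key to a dict of {"family:qualifier": value}.
    Only columns in `family` are kept; the qualifier becomes the field name.
    """
    prefix = family + ":"
    docs = []
    for row_key, columns in rows.items():
        doc = {"id": row_key}  # assumption: the Solr uniqueKey field is "id"
        for column, value in columns.items():
            if column.startswith(prefix):
                doc[column[len(prefix):]] = value
        docs.append(doc)
    return docs


def build_update_request(solr_url, collection, docs):
    """Build (but do not send) a JSON update request for a Solr collection."""
    url = f"{solr_url}/{collection}/update?commit=true"
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def num_found(response_text):
    """Read numFound from a Solr select response in JSON format."""
    return json.loads(response_text)["response"]["numFound"]


if __name__ == "__main__":
    # Hypothetical rows, shaped as a scan of the HBase table might return them.
    rows = {
        "row1": {"cf:tag": "solr"},
        "row2": {"cf:tag": "hbase", "other:x": "1"},
    }
    docs = rows_to_solr_docs(rows)
    request = build_update_request("http://localhost:8983/solr", "tagcollect", docs)
    print(request.full_url)
    print(json.dumps(docs))
    # urllib.request.urlopen(request) would send the batch to a running Solr.
```

After the job has run, `num_found` can be applied to the text of a `q=*:*` response like the one in the question to confirm that the document count is no longer 0.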