Search code examples
javaapache-kafkahbaseapache-kafka-connect

Using Schemaless JSON converter for Hbase Connector Kafka


I'm using the hbase sink connector for kafka (https://github.com/mravi/kafka-connect-hbase). So I tried to implement this connector using its JsonConverter in event parser class like below.

{
  "name": "test-hbase",
  "config": {
    "connector.class": "io.svectors.hbase.sink.HBaseSinkConnector",
    "tasks.max": "1",
    "topics": "hbase_connect",
    "zookeeper.quorum": "xxxxx.xxxx.xx.xx,xxxxx.xxxx.xx.xx,xxxxx.xxxx.xx.xx",
    "event.parser.class": "io.svectors.hbase.parser.JsonEventParser",
    "hbase.hbase_connect.rowkey.columns": "id",
    "hbase.hbase_connect.family": "col1",
  }
}

And this is my run distributed properties of kafka connect :

key.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=org.apache.kafka.connect.json.JsonConverter
key.converter.schemas.enable=false
value.converter.schemas.enable=false

internal.key.converter=org.apache.kafka.connect.json.JsonConverter
internal.value.converter=org.apache.kafka.connect.json.JsonConverter
internal.key.converter.schemas.enable=false
internal.value.converter.schemas.enable=false

The problem is when I'm trying to produce JSON message with no schema, the connector thrown null pointer exception below :

 [2018-12-10 16:45:06,607] ERROR WorkerSinkTask{id=hbase_connect-0}
 Task threw an uncaught and unrecoverable exception.
 Task is being killed and will not recover until manually restarted.
 (org.apache.kafka.connect.runtime.WorkerSinkTask:515)
 java.lang.NullPointerException
at io.svectors.hbase.util.ToPutFunction.apply(ToPutFunction.java:78)
at io.svectors.hbase.sink.HBaseSinkTask.lambda$4(HBaseSinkTask.java:105)
    at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
    at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
    at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
    at io.svectors.hbase.sink.HBaseSinkTask.lambda$3(HBaseSinkTask.java:105)
    at java.util.stream.Collectors.lambda$toMap$58(Collectors.java:1321)
    at java.util.stream.ReduceOps$3ReducingSink.accept(ReduceOps.java:169)
    at java.util.HashMap$EntrySpliterator.forEachRemaining(HashMap.java:1696)
    at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
    at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
    at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
    at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
    at io.svectors.hbase.sink.HBaseSinkTask.put(HBaseSinkTask.java:104)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:495)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:288)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:198)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:166)
    at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:170)
    at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:214)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748) [2018-12-10 16:45:06,607] ERROR WorkerSinkTask{id=hbase_connect-0} 
    Task threw an uncaught and unrecoverable exception (org.apache.kafka.connect.runtime.WorkerTask:172)
 org.apache.kafka.connect.errors.ConnectException: Exiting WorkerSinkTask due to unrecoverable exception.
at org.apache.kafka.connect.runtime.WorkerSinkTask.deliverMessages(WorkerSinkTask.java:517)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.poll(WorkerSinkTask.java:288)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.iteration(WorkerSinkTask.java:198)
    at org.apache.kafka.connect.runtime.WorkerSinkTask.execute(WorkerSinkTask.java:166)
    at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:170)
    at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:214)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)

This is the message that I use :

{"id": "9","name": "wis"}

Any suggestions for this error?


Solution

  • It was an issue with connector for schema less json .

    It has been fixed in the release here : https://github.com/nishutayal/kafka-connect-hbase/issues/5