Search code examples
javahbasenutchgora

Nutch 2.2.1 + hBase


I am trying to run the new version of Apache Nutch for crawling. When I start script /bin/crawl, it fails and hadoop.log says:

java.lang.Exception: java.lang.NoSuchMethodError: org.apache.gora.persistency.Persistent.getSchema()Lorg/apache/avro/Schema; at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:354) Caused by: java.lang.NoSuchMethodError: org.apache.gora.persistency.Persistent.getSchema()Lorg/apache/avro/Schema; at org.apache.gora.hbase.store.HBaseStore.put(HBaseStore.java:177)

Here is the log:

2013-07-04 16:12:05,069 WARN  mapred.LocalJobRunner - job_local1522971864_0001
java.lang.Exception: java.lang.NoSuchMethodError:     org.apache.gora.persistency.Persistent.getSchema()Lorg/apache/avro/Schema;
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:354)
Caused by: java.lang.NoSuchMethodError:     org.apache.gora.persistency.Persistent.getSchema()Lorg/apache/avro/Schema;
at org.apache.gora.hbase.store.HBaseStore.put(HBaseStore.java:177)
at org.apache.gora.mapreduce.GoraRecordWriter.write(GoraRecordWriter.java:65)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.write(MapTask.java:638)
at org.apache.hadoop.mapreduce.TaskInputOutputContext.write(TaskInputOutputContext.java:80)
at org.apache.nutch.crawl.InjectorJob$UrlMapper.map(InjectorJob.java:191)
at org.apache.nutch.crawl.InjectorJob$UrlMapper.map(InjectorJob.java:88)
at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
at org.apache.hadoop.mapred.LocalJobRunner$Job$MapTaskRunnable.run(LocalJobRunner.java:223)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
at java.util.concurrent.FutureTask.run(FutureTask.java:166)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)

2013-07-04 16:12:05,720 ERROR crawl.InjectorJob - InjectorJob: java.lang.RuntimeException: job failed: name=[new]inject /opt/ir/nutch2/urls, jobid=job_local1522971864_0001
at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:54)
at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:233)
at org.apache.nutch.crawl.InjectorJob.inject(InjectorJob.java:251)
at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:273)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at org.apache.nutch.crawl.InjectorJob.main(InjectorJob.java:282)

Should I set some gora artifacts inside ivy.xml or something? Please help me.


Solution

  • Solved. You must add correct version of gora-hbase to you libraries. gora-hbase-0.3.jar