Apache Nutch 2.3.1 opic scoring filter not working

I have configured Nutch 2.3.1 with a complete Hadoop/HBase ecosystem on a small cluster. I am curious about the scoring algorithm used in Nutch. I found and enabled the OPIC scoring filter. To see its impact, I checked the score at different steps (the dbupdate and generate phases), as described in the Nutch wiki. However, every document's score always remains zero, no matter how many iterations I run or how many documents I fetch. Is there a problem in the OPIC implementation, or am I missing some configuration?

I have observed that the _csh_ field, which holds the cash, is removed from the corresponding HBase table during the fetcher phase.


  • I resolved it by storing the cash in Markers as a Utf8 instead of in Metadata:

    -    row.getMetadata().put(CASH_KEY, ByteBuffer.wrap(Bytes.toBytes(score)));
    +    row.getMarkers().put(CASH_KEY, new Utf8(Double.toString(score)));
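
To make the change in the diff concrete, here is a minimal, self-contained sketch of the two encodings involved. It is not Nutch code: the class and method names (`CashEncodingSketch`, `encodeBinary`, `encodeText`, and so on) are hypothetical, and it uses only the JDK rather than HBase's `Bytes` or Avro's `Utf8`. It assumes the old form is the 8-byte big-endian layout that `Bytes.toBytes(double)` produces, and that the new form is the UTF-8 text of `Double.toString(score)`. The diff only shows the write side, so any code that reads the cash back would presumably need the matching change: look it up in the markers and parse the text.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; none of these names exist in Nutch.
public class CashEncodingSketch {

    // Old form (assumption): the score as 8 raw big-endian bytes,
    // mirroring ByteBuffer.wrap(Bytes.toBytes(score)) from the removed line.
    static ByteBuffer encodeBinary(double score) {
        ByteBuffer buf = ByteBuffer.allocate(Double.BYTES);
        buf.putDouble(score);
        buf.flip(); // make the written bytes readable from position 0
        return buf;
    }

    // Reads the binary form back without disturbing the caller's buffer position.
    static double decodeBinary(ByteBuffer buf) {
        return buf.duplicate().getDouble();
    }

    // New form (assumption): the decimal text of the score as UTF-8 bytes,
    // mirroring new Utf8(Double.toString(score)) from the added line.
    static byte[] encodeText(double score) {
        return Double.toString(score).getBytes(StandardCharsets.UTF_8);
    }

    // Parses the text form back into a double; this is what a reader of the
    // marker value would have to do instead of decoding 8 raw bytes.
    static double decodeText(byte[] utf8) {
        return Double.parseDouble(new String(utf8, StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        double cash = 0.25;
        ByteBuffer binary = encodeBinary(cash);
        byte[] text = encodeText(cash);
        System.out.println("binary round trip: " + decodeBinary(binary));
        System.out.println("text round trip:   " + decodeText(text));
    }
}
```

Both forms round-trip the value; what differs is where the value lives and how a reader must decode it, so the write and read sides have to agree.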