I am writing a Nutch plugin that runs at fetch time. It does some analysis on the fetched web pages, and the results should be stored in HBase alongside the corresponding web page. I am not sure how to add an extra field or how to write data to that field using Nutch.
If you want to add additional fields while indexing in Solr:
If the values of the additional fields are fixed (static), you can use Nutch's index-static plugin.
It allows you to add a number of fields with their contents.
Step 1:
First, open nutch-site.xml; this is where the index.static property is set.
Step 2:
Add the index.static property, listing your fields and their values:
<property>
<name>index.static</name>
<value>first_field:value,second_field:value</value>
<description>
Used by the index-static plugin to add fields with static data at indexing time.
You can specify a comma-separated list of fieldname:fieldcontent per Nutch job.
Each fieldcontent can have multiple values separated by space, e.g.,
field1:value1.1 value1.2 value1.3,field2:value2.1 value2.2 ...
It can be useful when collections can't be created by URL patterns,
like in subcollection, but on a job-basis.
</description>
</property>
Step 3:
Add a field definition for each new field to Solr's schema.xml.
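As a rough sketch, the definitions for the two example fields from Step 2 might look like the following. The type and the indexed/stored/multiValued attributes here are assumptions, not something Nutch requires; pick whatever suits your schema. multiValued is set because a field's content can hold several space-separated values.

```xml
<!-- Hypothetical definitions for the example fields from index.static. -->
<!-- The "string" type and the attribute values are assumptions; adjust them to your schema. -->
<field name="first_field" type="string" indexed="true" stored="true" multiValued="true"/>
<field name="second_field" type="string" indexed="true" stored="true" multiValued="true"/>
```

These lines go inside the existing fields section of schema.xml, next to the fields Nutch already defines.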
Step 4:
Enable the index-static plugin in the plugin.includes property in nutch-site.xml.
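For illustration, a plugin.includes value with index-static added could look something like this. Everything in the value other than index-static is an assumption based on a typical Nutch 1.x setup; start from the value in your own nutch-default.xml or nutch-site.xml and only add "static" to the index-(...) group.

```xml
<property>
<name>plugin.includes</name>
<!-- Assumed example value; only the "static" entry in index-(...) is the actual change. -->
<value>protocol-http|urlfilter-regex|parse-(html|tika)|index-(basic|anchor|static)|indexer-solr|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
</property>
```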
Alternatively, you can follow https://wiki.apache.org/nutch/WritingPluginExample-1.2 to write your own plugin.