Fetch particular tags from HTML docs obtained after crawling and parsing using Apache Nutch 1.4

I used Nutch 1.4 to crawl a website. The crawl succeeded and all the pages were dumped into segments. I merged the segments into one and then used the readseg command to obtain a text version of all the crawled pages. Now I need to extract, for each page, its URL and the metadata stored in that page. I don't know which command to use, or whether I need to do something different.

I have searched Google extensively. Some people say you have to write a separate plugin for this. Can someone please tell me how to do it?

Thanks a lot :) :)

Solution

I finally got it working, so I'm sharing the answer in case someone else needs it. You can use the index-metatags plugin described here: http://wiki.apache.org/nutch/IndexMetatags

That solved the problem. Cheers :)
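
For anyone wondering what enabling the plugin looks like, here is a rough sketch of the kind of entries that go into conf/nutch-site.xml. It is not taken from my own setup: the property names (metatags.names, index.parse.md) and the plugin ids in plugin.includes are what I recall the wiki page listing for later 1.x releases, where the plugins are called parse-metatags and index-metadata. For 1.4 the plugin comes from a patch and the names may differ, so treat everything below as an assumption and check it against the wiki page. The meta tag names "description" and "keywords" are just examples.

```xml
<!-- Sketch only: property and plugin names are assumptions recalled from the
     IndexMetatags wiki page for later Nutch 1.x releases; verify for 1.4. -->
<configuration>
  <!-- Add the metatags parse plugin and the metadata indexing plugin to
       the list of active plugins (rest of the list shortened here). -->
  <property>
    <name>plugin.includes</name>
    <value>protocol-http|urlfilter-regex|parse-(html|tika|metatags)|index-(basic|anchor|metadata)|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
  </property>

  <!-- Which <meta name="..."> tags to extract during parsing (example names). -->
  <property>
    <name>metatags.names</name>
    <value>description;keywords</value>
  </property>

  <!-- Which parse metadata keys to copy into the index (example names). -->
  <property>
    <name>index.parse.md</name>
    <value>metatag.description,metatag.keywords</value>
  </property>
</configuration>
```

The pages have to be parsed again after changing the configuration, since the meta tags are picked up at parse time.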
