Tags: parsing, flv, web-crawler, nutch

How to extract content from an FLV file using a web crawler?


My requirement is to extract the text and audio from an FLV file. Please suggest how I can achieve this using a web crawler. If it is not possible with a web crawler, please suggest another tool.

Thank you


Solution

  • Using Nutch, you can parse an FLV file and extract its metadata. If the text was added to the file as part of the metadata, you can retrieve it with Nutch and store it in a database.

    For the audio, though, you should probably look at a combination of wget (to download the content) and an FLV stream extraction tool to achieve what you require.

    Nutch

    Wget

    FLV metadata
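
To make the "download, then extract" idea more concrete, here is a minimal Python sketch of the extraction half. It assumes the file has already been downloaded (for example with wget) and read into memory, and it walks the FLV container as laid out in Adobe's published FLV format: a 9-byte header followed by tags, each with an 11-byte tag header. The names `iter_flv_tags` and `split_streams` are made up for this illustration and do not come from Nutch or any other tool.

```python
import struct
from typing import Iterator, List, Tuple

# Tag type codes used by the FLV container.
TAG_AUDIO, TAG_VIDEO, TAG_SCRIPT = 8, 9, 18


def iter_flv_tags(data: bytes) -> Iterator[Tuple[int, int, bytes]]:
    """Yield (tag_type, timestamp_ms, payload) for each tag in an FLV byte string."""
    if data[:3] != b"FLV":
        raise ValueError("not an FLV file: missing 'FLV' signature")
    # Bytes 5-8 hold the header size (normally 9) as a big-endian uint32.
    (header_size,) = struct.unpack(">I", data[5:9])
    pos = header_size + 4  # skip the header and the first PreviousTagSize field
    while pos + 11 <= len(data):
        tag_type = data[pos] & 0x1F  # the low 5 bits carry the tag type
        size = int.from_bytes(data[pos + 1:pos + 4], "big")
        # 24-bit timestamp, plus one extension byte holding the upper 8 bits.
        timestamp = int.from_bytes(data[pos + 4:pos + 7], "big") | (data[pos + 7] << 24)
        payload = data[pos + 11:pos + 11 + size]
        if len(payload) < size:
            break  # truncated file: stop instead of yielding a partial tag
        yield tag_type, timestamp, payload
        pos += 11 + size + 4  # tag header + payload + trailing PreviousTagSize


def split_streams(data: bytes) -> Tuple[List[bytes], List[bytes]]:
    """Collect the raw audio tag payloads and script (metadata) tag payloads."""
    audio, metadata = [], []
    for tag_type, _timestamp, payload in iter_flv_tags(data):
        if tag_type == TAG_AUDIO:
            audio.append(payload)
        elif tag_type == TAG_SCRIPT:
            metadata.append(payload)
    return audio, metadata
```

Typical use would be something like `audio, metadata = split_streams(open("video.flv", "rb").read())`, where `video.flv` is a placeholder file name. Note that this only separates the tags. Each audio payload still begins with a one-byte FLV audio header, and the script payloads are AMF-encoded, so producing a playable audio file or readable metadata needs a further decoding step or a dedicated extraction tool.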