
Does Apache Storm allow the processing of voluminous files stored on HDFS?


Does Apache Storm allow the processing of voluminous files stored on HDFS?

My goal is to get a real-time response (within seconds or milliseconds).

Or is Apache Storm dedicated only to stream processing?

Thank you


Solution

  • Storm is designed for stream processing (as opposed to batch processing). If I'm understanding you correctly, though, you want to read files from HDFS and process them?

    The storm-hdfs module has a spout (topology data source). It might do what you want.

    https://github.com/apache/storm/tree/master/external/storm-hdfs#hdfs-spout
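
    As a rough sketch of how that spout might be wired into a topology (based on the usage shown in the storm-hdfs README; not tested against a cluster): the HDFS URI, the directory paths, the component names and the `PrintLineBolt` class are placeholders I made up, and you need the storm-hdfs dependency on the classpath. As far as I know, the spout watches the source directory, emits each line of a text file as a tuple, and moves a file to the archive directory once it has been fully read.

    ```java
    import org.apache.storm.Config;
    import org.apache.storm.StormSubmitter;
    import org.apache.storm.hdfs.spout.HdfsSpout;
    import org.apache.storm.hdfs.spout.TextFileReader;
    import org.apache.storm.topology.BasicOutputCollector;
    import org.apache.storm.topology.OutputFieldsDeclarer;
    import org.apache.storm.topology.TopologyBuilder;
    import org.apache.storm.topology.base.BaseBasicBolt;
    import org.apache.storm.tuple.Tuple;

    public class HdfsFileTopology {

        // Hypothetical bolt: prints every line the spout emits.
        public static class PrintLineBolt extends BaseBasicBolt {
            @Override
            public void execute(Tuple tuple, BasicOutputCollector collector) {
                // TextFileReader.defaultFields names the single output field "line".
                System.out.println(tuple.getStringByField("line"));
            }

            @Override
            public void declareOutputFields(OutputFieldsDeclarer declarer) {
                // Terminal bolt: emits nothing downstream.
            }
        }

        public static void main(String[] args) throws Exception {
            // All URIs and paths below are placeholders; adjust them to your cluster.
            HdfsSpout spout = new HdfsSpout()
                    .setReaderType("text")                        // read plain text, one tuple per line
                    .withOutputFields(TextFileReader.defaultFields)
                    .setHdfsUri("hdfs://localhost:54310")
                    .setSourceDir("/data/in")                     // directory the spout watches
                    .setArchiveDir("/data/done")                  // fully processed files are moved here
                    .setBadFilesDir("/data/badfiles");            // files that failed to parse

            TopologyBuilder builder = new TopologyBuilder();
            builder.setSpout("hdfs-spout", spout, 1);
            builder.setBolt("print-bolt", new PrintLineBolt(), 1)
                   .shuffleGrouping("hdfs-spout");

            StormSubmitter.submitTopology("hdfs-file-topology", new Config(), builder.createTopology());
        }
    }
    ```

    Keep in mind that this turns the files into a stream of lines. Each tuple is handled with low latency, but getting through a very large file still takes as long as it takes to read it.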