apache hadoop twitter flume

Can Apache Flume be used to extract tweets for a certain period of time?


I want to extract Twitter feeds related to a keyword for the months of June and July using Apache Flume. Is this possible at all?


Solution

  • AFAIK, the TwitterSource from Cloudera only receives data in real time, as it is generated. I think something similar applies to the Twitter 1% firehose source.

    Nevertheless, the Twitter API does appear to work with timelines, so it would be a matter of modifying the TwitterSource source code to query them.
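
If you do modify the source to pull tweets from timelines, you would still need to keep only those that fall inside your period. Below is a minimal sketch of that filtering step, not part of Flume or the Cloudera TwitterSource. The class name `TweetWindow` and its methods are hypothetical, the year 2014 is an assumption (the question does not name one), and the timestamp pattern assumes the `created_at` format used by Twitter's v1.1 REST API (for example `Wed Aug 27 13:08:45 +0000 2008`).

```java
import java.time.Instant;
import java.time.OffsetDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper: decides whether a tweet's created_at timestamp
// falls inside a fixed time window. Not a Flume or Twitter API class.
public class TweetWindow {
    // Assumed created_at layout, e.g. "Wed Aug 27 13:08:45 +0000 2008".
    private static final DateTimeFormatter CREATED_AT =
            DateTimeFormatter.ofPattern("EEE MMM dd HH:mm:ss Z yyyy", Locale.ENGLISH);

    private final Instant start; // inclusive lower bound
    private final Instant end;   // exclusive upper bound

    public TweetWindow(Instant start, Instant end) {
        if (!start.isBefore(end)) {
            throw new IllegalArgumentException("start must be before end");
        }
        this.start = start;
        this.end = end;
    }

    // Parses a created_at string into an Instant.
    public static Instant parseCreatedAt(String createdAt) {
        return OffsetDateTime.parse(createdAt, CREATED_AT).toInstant();
    }

    // True when start <= created_at < end.
    public boolean contains(String createdAt) {
        Instant t = parseCreatedAt(createdAt);
        return !t.isBefore(start) && t.isBefore(end);
    }

    public static void main(String[] args) {
        // June and July of an assumed year: [June 1, August 1).
        TweetWindow window = new TweetWindow(
                Instant.parse("2014-06-01T00:00:00Z"),
                Instant.parse("2014-08-01T00:00:00Z"));

        System.out.println(window.contains("Sun Jun 15 12:00:00 +0000 2014"));
        System.out.println(window.contains("Fri Aug 01 00:00:00 +0000 2014"));
    }
}
```

In a modified source, a check like `window.contains(...)` could run on each tweet before it is turned into a Flume event, so that only tweets from the requested months reach the channel.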