hbase, hadoop2, flume

Flume - how to read logs at a regular interval or when they reach a certain size


I would like to know whether it is possible to configure Flume to read logs at a regular time interval or when the logs reach a certain size, and if so, how. Thanks in advance.


Solution

  • Can Flume read logs at a regular time interval or when the logs reach a certain size, and how?

    Flume agents run continuously; this is a core feature of Flume. Whenever logs or messages arrive, at whatever interval, Flume is able to capture them.

    If you want to check the size of the incoming messages or logs, you have to write a custom source, e.g. FlumeSource (public class FlumeSource extends AbstractSource implements Configurable, EventDrivenSource), which captures each log message at the source and logs its size. For example, you can use:

    LOG.info("Processing message...with size = " + FileUtils.byteCountToDisplaySize(bytes.length));
    

    where FileUtils is the Apache Commons IO class and byteCountToDisplaySize formats a byte count in a human-readable form.
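
    The size check described above can be sketched without any Flume or Commons IO dependency. The class below is a hypothetical helper, not part of either library: MessageSizeTracker, record, and displaySize are names invented for illustration, and displaySize is only a simplified stand-in for byteCountToDisplaySize, so the real method may format values differently. It assumes a custom source would call record once per message body, with a threshold and an interval chosen by you.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper (not part of Flume or Commons IO): tracks message sizes per window. */
final class MessageSizeTracker {
    private final long thresholdBytes;  // report once this many bytes have accumulated
    private final long intervalMillis;  // or once this much time has passed
    private long accumulatedBytes = 0;
    private long windowStartMillis;

    MessageSizeTracker(long thresholdBytes, long intervalMillis, long nowMillis) {
        this.thresholdBytes = thresholdBytes;
        this.intervalMillis = intervalMillis;
        this.windowStartMillis = nowMillis;
    }

    /** Records one message body; returns true when either limit is reached, then starts a new window. */
    boolean record(byte[] body, long nowMillis) {
        accumulatedBytes += body.length;
        boolean sizeReached = accumulatedBytes >= thresholdBytes;
        boolean intervalElapsed = nowMillis - windowStartMillis >= intervalMillis;
        if (sizeReached || intervalElapsed) {
            accumulatedBytes = 0;
            windowStartMillis = nowMillis;
            return true;
        }
        return false;
    }

    long accumulatedBytes() {
        return accumulatedBytes;
    }

    /** Simplified stand-in for a human-readable size: whole units, rounded down. */
    static String displaySize(long bytes) {
        String[] units = {"bytes", "KB", "MB", "GB", "TB"};
        long value = bytes;
        int unit = 0;
        while (value >= 1024 && unit < units.length - 1) {
            value /= 1024;
            unit++;
        }
        return value + " " + units[unit];
    }

    public static void main(String[] args) {
        // Assumed limits for illustration: 2 KB or 60 seconds, whichever comes first.
        MessageSizeTracker tracker =
                new MessageSizeTracker(2048, 60_000, System.currentTimeMillis());
        byte[] body = "example log line".getBytes(StandardCharsets.UTF_8);
        boolean limitReached = tracker.record(body, System.currentTimeMillis());
        System.out.println("Processing message with size = " + displaySize(body.length)
                + ", limit reached = " + limitReached);
    }
}
```

    Passing the current time in as a parameter, rather than reading the clock inside record, keeps the helper easy to test. In a real source you would pass System.currentTimeMillis() and the event body.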

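    To serialize event logs to HBase, you can write a custom serializer with the AsyncHBase API, for example by implementing Flume's AsyncHbaseEventSerializer interface for the AsyncHBaseSink.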