apache-nifi logstash-grok

How can ExtractGrok use multiple regular expressions?


I have a Kafka topic that contains different types of messages sent from different sources.

I would like to use the ExtractGrok processor to extract the message based on the regular expression/grok pattern.

How do I configure or run the processor with multiple regular expressions?

For example, the Kafka topic contains INFO, WARNING and ERROR log entries from different applications.

I would like to separate the messages by log level and place them into HDFS.


Solution

  • Instead of the ExtractGrok processor, use the PartitionRecord processor in NiFi. This processor:

    1. Evaluates one or more RecordPaths against each record in the incoming FlowFile.

    2. Groups each record with other "like records", that is, records that yield the same values for those RecordPaths.

    To set it up, configure and enable the controller services:

      Record Reader as GrokReader

      Record Writer as your desired format

    Then use the PutHDFS processor to store each FlowFile based on its loglevel attribute.
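
    As a rough sketch of how these settings might fit together (not a tested configuration): the Grok expression, the field name level, the writer, and the /logs base directory are assumptions for illustration, and the property names are given as they commonly appear in NiFi, so check them against your version. The loglevel attribute comes from a user-defined (dynamic) property on PartitionRecord whose value is a RecordPath.

    ```text
    GrokReader (controller service)
      Grok Expression : %{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:level} %{GREEDYDATA:message}

    PartitionRecord
      Record Reader   : GrokReader
      Record Writer   : (writer for your desired format, e.g. JsonRecordSetWriter)
      loglevel        : /level      (dynamic property: attribute name = RecordPath)

    PutHDFS
      Directory       : /logs/${loglevel}
    ```

    With a setup along these lines, each outgoing FlowFile would hold records of a single log level and carry that level in its loglevel attribute, which PutHDFS can then reference through Expression Language.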

    Flow:

    1. ConsumeKafka processor
    2. PartitionRecord processor
    3. PutHDFS processor
    

    Refer to this link, which describes all the steps to configure the PartitionRecord processor.

    Refer to this link, which describes how to store partitions dynamically in HDFS directories using the PutHDFS processor.
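
To make the partitioning idea concrete, here is a small standalone Python sketch of what the flow does conceptually. It is not NiFi code: the regular expression stands in for a Grok expression, and the log layout, the `level` field name, and the `/logs` base directory are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical log layout: "<timestamp> <LEVEL> <message>".
# This regex plays the role of the Grok expression in GrokReader.
LOG_LINE = re.compile(
    r"^(?P<timestamp>\S+)\s+(?P<level>INFO|WARNING|ERROR)\s+(?P<message>.*)$"
)


def parse_records(lines):
    """Turn raw lines into dict records, skipping lines that do not match."""
    records = []
    for line in lines:
        match = LOG_LINE.match(line)
        if match:
            records.append(match.groupdict())
    return records


def partition_by_level(records):
    """Group records sharing the same 'level', similar to a RecordPath of /level."""
    partitions = defaultdict(list)
    for record in records:
        partitions[record["level"]].append(record)
    return dict(partitions)


def target_directory(level, base="/logs"):
    """Build a per-level directory; the base path is a made-up example."""
    return f"{base}/{level.lower()}"


sample = [
    "2021-03-01T10:00:00Z INFO service started",
    "2021-03-01T10:00:05Z ERROR connection refused",
    "2021-03-01T10:00:07Z WARNING retrying in 5s",
    "2021-03-01T10:00:12Z ERROR connection refused",
]

partitions = partition_by_level(parse_records(sample))
for level, group in partitions.items():
    print(target_directory(level), len(group))
```

Each group corresponds to one outgoing FlowFile from PartitionRecord, and `target_directory` mirrors the idea of pointing PutHDFS at a directory derived from the loglevel attribute.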