I have a Kafka topic that contains different types of messages sent from different sources.
I would like to use the ExtractGrok processor to extract the messages based on a regular expression/grok pattern.
How do I configure or run the processor with multiple regular expressions?
For example, the Kafka topic contains INFO, WARNING and ERROR log entries from different applications.
I would like to separate the messages by log level and place them into HDFS.
Instead of the ExtractGrok processor, use the PartitionRecord processor in NiFi.
It evaluates one or more RecordPaths against each record in the incoming FlowFile.
Each record is then grouped with other "like records", so records with the same log level end up in the same outgoing FlowFile.
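To make the grouping idea concrete, here is a minimal Python sketch of what partitioning by log level amounts to. It is not NiFi code: the log layout, the regular expression, and the `partition_by_level` helper are all assumptions made up for illustration, standing in for a grok pattern and a RecordPath that points at the level field.

```python
import re
from collections import defaultdict

# Hypothetical log layout: "<date> <time> <LEVEL> <message>".
# In NiFi, a grok expression in the GrokReader would play this role.
LOG_LINE = re.compile(
    r"^(?P<timestamp>\S+ \S+) (?P<level>INFO|WARNING|ERROR) (?P<message>.*)$"
)


def partition_by_level(lines):
    """Group parsed records by their 'level' field, keeping input order."""
    groups = defaultdict(list)
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # this sketch simply skips lines that do not parse
        record = match.groupdict()
        groups[record["level"]].append(record)
    return dict(groups)


sample = [
    "2019-03-01 10:00:00 INFO service started",
    "2019-03-01 10:00:05 ERROR connection refused",
    "2019-03-01 10:00:09 INFO request handled",
]
partitions = partition_by_level(sample)
for level, records in partitions.items():
    print(level, len(records))
```

Each key in `partitions` corresponds to one outgoing FlowFile in the NiFi flow, which is why a single pattern that captures the level is enough and separate expressions per level are not needed.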
Configure and enable the controller services:
Record Reader: GrokReader
Record Writer: your desired format
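Then use the PutHDFS processor to store each FlowFile based on its loglevel attribute. PartitionRecord adds this attribute to every outgoing FlowFile when you define a dynamic property named loglevel whose value is the RecordPath of the log-level field.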
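The way an attribute can drive the target directory can be sketched in a few lines of Python. This is only a simplified stand-in for NiFi Expression Language, handling plain `${name}` references and nothing else; the `/logs/${loglevel}` template and the `resolve_directory` helper are hypothetical names chosen for the example.

```python
import re


def resolve_directory(template, attributes):
    """Replace each ${name} in template with the matching attribute value."""
    def substitute(match):
        # An attribute that is not set resolves to an empty string here.
        return attributes.get(match.group(1), "")
    return re.sub(r"\$\{(\w+)\}", substitute, template)


# One FlowFile per log level, each carrying its own loglevel attribute.
directory = resolve_directory("/logs/${loglevel}", {"loglevel": "ERROR"})
print(directory)
```

With a Directory value along these lines in PutHDFS, each partition would land in its own HDFS directory without a separate processor per log level.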
Flow:
1. ConsumeKafka processor
2. PartitionRecord processor
3. PutHDFS processor
Refer to this link, which describes all the steps to configure the PartitionRecord processor.
Refer to this link, which describes how to store partitions dynamically in HDFS directories using the PutHDFS processor.