Every 5 seconds I receive a log file of 10000 rows like the following:
log_datetime1 host_name1 log_message1
log_datetime2 host_name2 log_message2
log_datetime3 host_name3 log_message3
I want to send them to a Kudu or Parquet table as the following JSON:
{"current_datetime":"datetime", "log_data":"log_datetime1 host_name1 log_message1"}
{"current_datetime":"datetime", "log_data":"log_datetime2 host_name2 log_message2"}
{"current_datetime":"datetime", "log_data":"log_datetime3 host_name3 log_message3"}
Currently I'm using two ReplaceText processors. The first adds
{"current_datetime":"datetime", "log_data":"
at the beginning of each line of the 10000-row log file, and the second adds "}
at the end of each line.
I was wondering if I could do both steps in one ReplaceText processor.
Using the search pattern (.+)(?=\n)
and the replacement pattern {"current_datetime":"datetime", "log_data":"$1"}
will produce the desired output in a single ReplaceText processor. The search pattern matches the text of each line that is followed by a newline, so a final line with no trailing newline is left unchanged, and the replacement places the capture group $1 inside the JSON template.
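To check the pattern outside NiFi, here is a minimal sketch in Python. It is only an approximation of what ReplaceText does: NiFi uses Java regular expressions, and Python writes the capture group as \g<1> where ReplaceText writes $1. The function name wrap_lines and the sample data are made up for illustration, and the literal "datetime" placeholder is kept as in the question.

```python
import re

# Search pattern from the answer: one or more characters followed by a newline.
SEARCH = re.compile(r"(.+)(?=\n)")

# Python refers to the first capture group as \g<1>; ReplaceText uses $1.
REPLACEMENT = r'{"current_datetime":"datetime", "log_data":"\g<1>"}'


def wrap_lines(text: str) -> str:
    """Wrap every newline-terminated line in the JSON template (hypothetical helper)."""
    return SEARCH.sub(REPLACEMENT, text)


# Sample input shaped like the log rows in the question.
sample = (
    "log_datetime1 host_name1 log_message1\n"
    "log_datetime2 host_name2 log_message2\n"
    "log_datetime3 host_name3 log_message3\n"
)

result = wrap_lines(sample)
print(result, end="")
```

Note that this is plain text substitution, so a log message containing a double quote or a backslash would not come out as valid JSON; such characters would need escaping in a separate step.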