I have a program that writes structured logs. For example:
{
"time": "time_val",
"log": "{
\"field1\": \"value1\",
\"field2\": \"value2\",
\"field3\": \"{
\"nested_field1\": \"value1\",
\"nested_field2\": \"value2\",
\"nested_field3\": \"value3\"
}\"
}"
}
I am using fluentd to tail the container's output and parse the JSON messages. However, I would also like to parse the nested structured logs so that they are flattened into the original message. For the example above, I would want fluentd to eventually treat the message as:
{
"time": "time_val",
"field1": "value1",
"field2": "value2",
"nested_field1": "value1",
"nested_field2": "value2",
"nested_field3": "value3"
}
Is this something that can be done using fluentd configuration? Changing the original program behavior is not an option in my case.
You can use the parser filter plugin with its key_name, reserve_data, and remove_key_name_field parameters.
Example:
<filter **>
@type parser
key_name field3
reserve_data true
remove_key_name_field true
<parse>
@type json
</parse>
</filter>
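To make the effect of those three parameters concrete, here is a rough Python sketch of the parse-and-merge step. It only illustrates the semantics as I understand them; it is not fluentd's actual implementation, and the function name parser_filter is made up for this example.

```python
import json


def parser_filter(record, key_name, reserve_data=True, remove_key_name_field=True):
    """Hypothetical stand-in for one <filter> block with @type parser.

    Parses the JSON string stored under `key_name` and merges the result
    into the record, roughly mirroring the three options in the config.
    """
    parsed = json.loads(record[key_name])

    # reserve_data true: keep the other fields of the original record.
    result = dict(record) if reserve_data else {}

    # remove_key_name_field true: drop the raw string once it has been parsed.
    if remove_key_name_field:
        result.pop(key_name, None)

    # Parsed keys are merged in at the top level of the record.
    result.update(parsed)
    return result


record = {
    "field1": "value1",
    "field2": "value2",
    "field3": json.dumps({"nested_field1": "value1", "nested_field2": "value2"}),
}
flattened = parser_filter(record, "field3")
print(flattened)
```

With reserve_data left at false, only the parsed keys would survive, which is why it needs to be enabled for flattening.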
Here is a complete working example, using a valid version of your JSON:
{"field1":"value1","field2":"value2","field3":"{\"nested_field1\":\"value1\",\"nested_field2\":\"value2\",\"nested_field3\":\"value3\"}"}
fluent-flatten-json.conf:
<source>
@type forward
</source>
<filter **>
@type parser
key_name field3
reserve_data true
remove_key_name_field true
<parse>
@type json
</parse>
</filter>
<match **>
@type stdout
</match>
Run fluentd:
fluentd -c ./fluent-flatten-json.conf
From another terminal, run fluent-cat with the input JSON:
fluent-cat test <<< '{"field1":"value1","field2":"value2","field3":"{\"nested_field1\":\"value1\",\"nested_field2\":\"value2\",\"nested_field3\":\"value3\"}"}'
Output in the fluentd logs:
{"field1":"value1","field2":"value2","nested_field1":"value1","nested_field2":"value2","nested_field3":"value3"}
Formatted output:
{
"field1": "value1",
"field2": "value2",
"nested_field1": "value1",
"nested_field2": "value2",
"nested_field3": "value3"
}
UPDATE
For a double-nested valid raw escaped JSON:
{"time":"time_val","log":"{\"field1\":\"value1\",\"field2\":\"value2\",\"field3\":\"{\\\"nested_field1\\\":\\\"nested_value1\\\",\\\"nested_field2\\\":\\\"nested_value2\\\",\\\"nested_field3\\\":\\\"nested_value3\\\"}\"}"}
The double-nested JSON in the question is not valid, so I recreated it as the record above.
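If you want to generate such a record yourself instead of escaping it by hand, one option is to serialize from the inside out. This is only a sketch with Python's json module; the field names and values are the ones from the example, and the variable names are arbitrary.

```python
import json

# Innermost object: this becomes the string value of "field3".
inner = {
    "nested_field1": "nested_value1",
    "nested_field2": "nested_value2",
    "nested_field3": "nested_value3",
}

# Middle object: this becomes the string value of "log".
log = {
    "field1": "value1",
    "field2": "value2",
    "field3": json.dumps(inner, separators=(",", ":")),
}

# Outer record: each json.dumps call adds one more level of escaping.
record = {"time": "time_val", "log": json.dumps(log, separators=(",", ":"))}

line = json.dumps(record, separators=(",", ":"))
print(line)
```

Each level of nesting is serialized separately, so the escaping is always consistent and the result parses back cleanly one level at a time.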
The following should work. The filters are applied in the order they are defined, so log is parsed first and field3 second:
<filter **>
@type parser
key_name log
reserve_data true
remove_key_name_field true
<parse>
@type json
</parse>
</filter>
<filter **>
@type parser
key_name field3
reserve_data true
remove_key_name_field true
<parse>
@type json
</parse>
</filter>
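The order of the two filters matters, because field3 only becomes a top-level key after log has been parsed. The Python sketch below illustrates that. It is a simplified model, not fluentd code: the helper names are invented, and leaving a record unchanged when the key is absent is a simplification of this sketch rather than a statement about how fluentd handles that case.

```python
import json


def parse_and_merge(record, key_name):
    """Hypothetical model of one parser filter with reserve_data and
    remove_key_name_field both set to true."""
    if key_name not in record:
        # Simplification for this sketch: pass the record through untouched.
        return dict(record)
    result = dict(record)
    parsed = json.loads(result.pop(key_name))
    result.update(parsed)
    return result


def apply_filters(record, key_names):
    """Apply one parse-and-merge stage per key, in the given order."""
    for key_name in key_names:
        record = parse_and_merge(record, key_name)
    return record


inner = {"nested_field1": "nested_value1", "nested_field2": "nested_value2"}
log = {"field1": "value1", "field2": "value2", "field3": json.dumps(inner)}
record = {"time": "time_val", "log": json.dumps(log)}

in_order = apply_filters(record, ["log", "field3"])
reversed_order = apply_filters(record, ["field3", "log"])

print(in_order)        # fully flattened
print(reversed_order)  # field3 is still an unparsed string here
```

In the reversed case the field3 stage runs before that key exists at the top level, so only one level is flattened.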