Tags: json, regex, parsing, fluentd

How to parse a log line containing a JSON object using Fluentd


I have an nginx log file with lines like the following:

127.0.0.1 192.168.0.1 - [28/Feb/2013:12:00:00 +0900] "GET / HTTP/1.1" 200 777 "-" "Opera/12.0" - <{"key1":"value1","key2":98765,"key3":false,"key4":["one","two"],"key5":{"key22":98765,"key23":false,"key24":["one-one","two-two"]}}>

As you can see, the last value is a JSON object. I need to parse each line into the following format:

time:
1362020400 (28/Feb/2013:12:00:00 +0900)

record:
{
  "remote"              : "127.0.0.1",
  "host"                : "192.168.0.1",
  "user"                : "-",
  "method"              : "GET",
  "path"                : "/",
  "code"                : "200",
  "size"                : "777",
  "referer"             : "-",
  "agent"               : "Opera/12.0",
  "http_x_forwarded_for": "-",
  "myobject"            :{
                          "key1": "value1",
                          "key2": 98765,
                          "key3": false,
                          "key4": [
                            "one",
                            "two"
                          ],
                          "key5": {
                            "key22": 98765,
                            "key23": false,
                            "key24": [
                              "one-one",
                              "two-two"
                            ]
                          }
                         }
}

I can parse the line with a format like this:

expression /^(?<remote>[^ ]*) (?<host>[^ ]*) (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^\"]*?)(?: +\S*)?)?" (?<code>[^ ]*) (?<size>[^ ]*)(?: "(?<referer>[^\"]*)" "(?<agent>[^\"]*)"(?:\s+(?<http_x_forwarded_for>[^ ]+))?) \<(?<myobject>[^\>]*)\>?$/
time_format %d/%b/%Y:%H:%M:%S %z

But that parses the trailing JSON object as a plain string. How can I keep that value as a JSON object?


Solution

  • You can use the parser filter plugin (filter_parser) to parse the string into a JSON object:

    <filter foo.bar>
      @type parser
      key_name myobject
      reserve_data true
      remove_key_name_field true
      hash_value_field parsed
      <parse>
        @type json
      </parse>
    </filter>
    

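    With these settings, the other fields are kept (reserve_data), the raw myobject string is dropped (remove_key_name_field), and the parsed object is stored under the parsed key (hash_value_field). For example: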

    # input data:  {"host":"192.168.0.1", "myobject":"{\"key1\":1,\"key2\":2}"}
    # output data: {"host":"192.168.0.1", "parsed":{"key1":1,"key2":2}}
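
To make the effect of these options concrete, here is a minimal Python sketch that imitates the two steps on a plain dictionary: first the regular expression from the question, then the JSON decoding of one field. It is an illustration under assumptions, not Fluentd code. The names `LINE_RE` and `parse_json_field` are hypothetical, the expression is a Python port of the one in the question (Ruby's `(?<name>...)` becomes `(?P<name>...)`), and the option handling only approximates what the filter does.

```python
import json
import re

# Python port of the question's expression (hypothetical name LINE_RE).
LINE_RE = re.compile(
    r'^(?P<remote>[^ ]*) (?P<host>[^ ]*) (?P<user>[^ ]*) \[(?P<time>[^\]]*)\] '
    r'"(?P<method>\S+)(?: +(?P<path>[^"]*?)(?: +\S*)?)?" '
    r'(?P<code>[^ ]*) (?P<size>[^ ]*)'
    r'(?: "(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
    r'(?:\s+(?P<http_x_forwarded_for>[^ ]+))?) '
    r'<(?P<myobject>[^>]*)>?$'
)


def parse_json_field(record, key_name="myobject", hash_value_field="parsed",
                     reserve_data=True, remove_key_name_field=True):
    """Approximate the filter's options on a dict record (illustrative only)."""
    parsed = json.loads(record[key_name])       # string -> nested object
    out = dict(record) if reserve_data else {}  # keep the other fields?
    if remove_key_name_field:
        out.pop(key_name, None)                 # drop the raw JSON string
    if hash_value_field:
        out[hash_value_field] = parsed          # nest under a single key
    else:
        out.update(parsed)                      # or merge into the top level
    return out


# The small example from the answer.
record = {"host": "192.168.0.1", "myobject": '{"key1":1,"key2":2}'}
result = parse_json_field(record)
print(result)  # {'host': '192.168.0.1', 'parsed': {'key1': 1, 'key2': 2}}

# The full log line from the question. Fluentd would consume `time` as the
# event timestamp; in this sketch it simply stays a string field.
line = ('127.0.0.1 192.168.0.1 - [28/Feb/2013:12:00:00 +0900] '
        '"GET / HTTP/1.1" 200 777 "-" "Opera/12.0" - '
        '<{"key1":"value1","key2":98765,"key3":false,"key4":["one","two"],'
        '"key5":{"key22":98765,"key23":false,"key24":["one-one","two-two"]}}>')
fields = LINE_RE.match(line).groupdict()
event = parse_json_field(fields)
print(event["parsed"]["key5"]["key24"])  # ['one-one', 'two-two']
```

The regular expression alone can only capture `myobject` as text. The nested structure appears only after the separate JSON decoding step, which is why the answer adds a filter after the parse stage.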