json, ruby, fluentd

How to parse strings that are not standard JSON in Ruby?


I am using fluentd to parse logs that look like this:

{ date="2017-04-01 10:22:18.306", message="This is a trace Message!" }
{ date="2017-04-01 10:22:18.306", message="This is a debug message" }

The standard JSON version of the first line would be:

{ "date":"2017-04-01 10:22:18.306", "message":"This is a trace Message!" }

I have tried:

require 'yajl'

str = '{ date="2017-04-01 10:22:18.306", message="This is a trace Message!" }'
Yajl::Parser.parse(str)

It does not work, because the keys are unquoted and `=` is used in place of `:`:

Yajl::ParseError: lexical error: invalid char in json text.
                                     { date="2017-04-01 10:22:18.306",
                     (right here) ------^

    from /var/lib/gems/2.3.0/gems/yajl-ruby-1.2.1/lib/yajl.rb:37:in `parse'
    from /var/lib/gems/2.3.0/gems/yajl-ruby-1.2.1/lib/yajl.rb:37:in `parse'
    from (irb):45
    from /usr/bin/irb:11:in `<main>'

Solution

  • You could use String#scan with a regex that captures both fields:

    data = %q(
    { date="2017-04-01 10:22:18.306", message="This is a trace Message!" }
    { date="2017-04-01 10:22:18.306", message="This is a debug message" }
    )
    
    pattern = /date="([^"]+)", message="([^"]+)"/
    
    messages = data.scan(pattern).map do |date, message|
      { date: date, message: message }
    end
    
    p messages
    # [{:date=>"2017-04-01 10:22:18.306", :message=>"This is a trace Message!"}, {:date=>"2017-04-01 10:22:18.306", :message=>"This is a debug message"}]
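
If the log lines may carry fields other than date and message, the same scan idea can be generalized to capture any key="value" pair. The following is only a minimal sketch under stated assumptions: `parse_log_line` is a hypothetical helper name (not part of fluentd or yajl), keys are assumed to be word characters, and values are assumed never to contain a double quote.

```ruby
# Hypothetical helper: turn one log line of key="value" pairs into a Hash.
# Assumes keys match \w+ and values contain no double quotes.
def parse_log_line(line)
  # scan returns [[key, value], ...]; to_h builds a Hash with String keys
  line.scan(/(\w+)="([^"]*)"/).to_h
end

lines = [
  '{ date="2017-04-01 10:22:18.306", message="This is a trace Message!" }',
  '{ date="2017-04-01 10:22:18.306", message="This is a debug message" }'
]

records = lines.map { |line| parse_log_line(line) }

puts records.first["message"]
# prints: This is a trace Message!
```

Unlike the fixed pattern above, this version yields String keys and does not depend on the order of the fields, but it will silently skip anything that is not in key="value" form.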