I'm really liking ELK for parsing logs. However, I'm stuck at a point where I need it to parse a list of dictionaries. These are my logs:
IP - - 0.000 0.000 [24/May/2015:06:51:13 +0000] *"POST /c.gif HTTP/1.1"* 200 4 * user_id=UserID&package_name=SomePackageName&model=Titanium+S202&country_code=in&android_id=AndroidID&eT=1432450271859&eTz=GMT%2B05%3A30&events=%5B%7B%22eV%22%3A%22com.olx.southasia%22%2C%22eC%22%3A%22appUpdate%22%2C%22eA%22%3A%22app_activated%22%2C%22eTz%22%3A%22GMT%2B05%3A30%22%2C%22eT%22%3A%221432386324909%22%2C%22eL%22%3A%22packageName%22%7D%5D * "-" "-" "-"
The URL-decoded version of the above log is:
IP - - 0.000 0.000 [24/May/2015:06:51:13 0000] *"POST /c.gif HTTP/1.1"* 200 4 * user_id=UserID&package_name=SomePackageName&model=Titanium S202&country_code=in&android_id=AndroidID&eT=1432450271859&eTz=GMT+05:30&events=[{"eV":"com.olx.southasia","eC":"appUpdate","eA":"app_activated","eTz":"GMT+05:30","eT":"1432386324909","eL":"packageName"}] * "-" "-" "-"
Whenever I try to parse it, it shows me _jsonparsefailure. I've gone through this question as well as various forums, but didn't find a proper solution. How can I parse a JSON list in Logstash? If there is no way to do so yet, what would be a workaround?
Here is my config file:
filter {
  mutate {
    gsub => [
      "message", "\+", "%20"
    ]
  }
  urldecode {
    field => "message"
  }
  grok {
    match => [
      'message', '%{IP:clientip}%{GREEDYDATA} \[%{GREEDYDATA:timestamp}\] \*"%{WORD:method}%{GREEDYDATA}'
    ]
  }
  kv {
    field_split => "&?"
  }
  json {
    source => "events"
  }
  geoip {
    source => "clientip"
  }
}
This question is an exact copy of Parse json in a list in logstash, even with the same log entries?! Could anyone make sense of that?
You can see my answer there, but I will sum it up for you... option e) is probably the best approach.
Apparently you get a _jsonparsefailure because of the square brackets. As a workaround, you can manually remove them. Add the following mutate filter after your kv and before your json filter:
mutate {
  gsub => [
    "events", "\]", "",
    "events", "\[", ""
  ]
}
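With the events value from your log, this turns

[{"eV":"com.olx.southasia","eC":"appUpdate","eA":"app_activated","eTz":"GMT+05:30","eT":"1432386324909","eL":"packageName"}]

into

{"eV":"com.olx.southasia","eC":"appUpdate","eA":"app_activated","eTz":"GMT+05:30","eT":"1432386324909","eL":"packageName"}

which the json filter can handle.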
However, that doesn't work for an input like [{"foo":"bar"},{"foo":"bar1"}]. So here are 5 options:
Option a) ugly gsub
An ugly workaround would be another gsub:
gsub => [ "event","\},\{",","]
But this would remove the inner relations so I guess you don't want to do that.
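For example, [{"eV":"com.olx.southasia"},{"eC":"appUpdate"}] would end up as {"eV":"com.olx.southasia","eC":"appUpdate"}, i.e. one flat dictionary in which you can no longer tell which pairs belonged to the same entry.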
Option b) split
A better approach might be to use the split filter:
split {
  field => "events"
  terminator => ","
}
mutate {
  gsub => [
    "events", "\]", "",
    "events", "\[", ""
  ]
}
json {
  source => "events"
}
This would generate multiple events, the first with foo = bar and the second with foo = bar1. Note that splitting on "," only works as long as each dictionary holds a single key/value pair; for dictionaries like the ones in your log, the commas inside a dictionary would be split on as well.
Option c) mutate split
You might want to have all the values in one logstash event. You could use the mutate => split filter to generate an array and parse the json if an entry exists. Unfortunately you will have to set a conditional for each entry because logstash doesn't support loops in its config.
mutate {
  gsub => [
    "events", "\]", "",
    "events", "\[", ""
  ]
  split => [ "events", "," ]
}
json {
  source => "[events][0]"
  target => "[result][0]"
}
if [events][1] {
  json {
    source => "[events][1]"
    target => "[result][1]"
  }
  if [events][2] {
    json {
      source => "[events][2]"
      target => "[result][2]"
    }
  }
  # You would have to add more conditionals if you expect even more dictionaries
}
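For an input like [{"foo":"bar"},{"foo":"bar1"}], the mutate filter should leave you with an events array of raw JSON strings, roughly

events = [ '{"foo":"bar"}', '{"foo":"bar1"}' ]

and each json filter then parses one of those entries into the corresponding result entry.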
Option d) Ruby1
The following works (place it after your kv filter), but rather use option e):
mutate {
  gsub => [
    "events", "\]", "",
    "events", "\[", ""
  ]
}
ruby {
  init => "require 'json'"
  code => "
    # split the bracket-stripped list into the individual dictionaries
    e = event['events'].split(',')
    ary = Array.new
    e.each do |x|
      # parse each dictionary and collect its key/value pairs
      hash = JSON.parse(x)
      hash.each do |key, value|
        ary.push( { key => value } )
      end
    end
    event['result'] = ary
  "
}
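For an input like [{"foo":"bar"},{"foo":"bar1"}] this should give you result = [{"foo"=>"bar"}, {"foo"=>"bar1"}]. Keep in mind that the split(',') has the same single key/value pair limitation as option b).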
Option e) Ruby2
After some testing this might be the best approach. Use this after your kv filter:
ruby {
  init => "require 'json'"
  code => "event['result'] = JSON.parse(event['events'])"
}
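This parses the whole list in one go, so result ends up as an array of dictionaries without any gsub workarounds. Note that the event['field'] syntax only works on older Logstash versions; since the Event API change in Logstash 5.x you would use event.get and event.set instead, roughly like this (a sketch, untested on your setup):

ruby {
  init => "require 'json'"
  code => "event.set('result', JSON.parse(event.get('events')))"
}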