I'm new to Logstash and would like help parsing my application log, which looks like this (I replaced the API key with random characters):
2019-07-17 16:57:20,522 : braze INFO: Body: {"attributes": [{"external_id": "vT9fswqW", "email": "[email protected]", "site": "site"}], "api_key": "fg09831e-9re0-tc19-81c6-08934539f0vx2", "events": [{"properties": {"site": "site"}, "external_id": "vT9fswqW", "name": "REGISTER", "time": "2019-07-17'+0'16:57:20.522380"}]}
This log goes to Logstash, where filters can be applied before it is stored in Elasticsearch. I would like to sanitize it to hide certain information, such as the email, external_id and api_key, so that the final output from Logstash would be something like:
2019-07-17 16:57:20,522 : braze INFO: Body: {"attributes": [{"external_id": "****", "email": "****", "site": "site"}], "api_key": "fg09831e-****", "events": [{"properties": {"site": "site"}, "external_id": "****", "name": "REGISTER", "time": "2019-07-17'+0'16:57:20.522380"}]}
The part starting with {"attributes"... is valid JSON, so I was thinking that if I could store that part of the log in a separate field, I could then apply the Logstash json filter and mutate the fields. I'm trying to separate out the JSON, but all my attempts using grok are failing. Any ideas how I can make it work?
You could do it using grok followed by a json filter:
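# Capture everything after "Body: " into a metadata field (metadata is not sent to outputs)
grok { match => { "message" => "Body: %{GREEDYDATA:[@metadata][json]}" } }
# Parse the captured JSON into event fields, then drop the sensitive keys
json {
    source => "[@metadata][json]"
    remove_field => [ "[api_key]", "[attributes][0][email]", "[attributes][0][external_id]", "[events][0][external_id]" ]
}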
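Note that remove_field drops those keys from the parsed event rather than replacing their values with ****, and the original message field still contains the raw text. If the aim is to keep the line as it is and only mask the values, as in the desired output in the question, one option is to rewrite message with mutate gsub. The following is an untested sketch: it assumes the `"key": "value"` spacing shown in the sample log, and that backreferences such as \1 in the replacement behave as they do in Ruby's String#gsub.

```
filter {
  mutate {
    gsub => [
      # Replace the whole value of external_id and email with ****
      # (single-quoted strings keep the backslash literal by default)
      "message", '("(?:external_id|email)": ")[^"]*', '\1****',
      # Keep the api_key up to and including the first hyphen, mask the rest
      "message", '("api_key": "[^"-]*-)[^"]*', '\1****'
    ]
  }
}
```

Because this works on the raw text, it does not need the JSON to be parsed first, but it will miss values if the producer changes its spacing or quoting.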
To parse the first part of the message I would use dissect rather than grok.
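A minimal sketch of what that could look like for the sample line, assuming the prefix always has the shape `date time : logger level: Body: ...`. The field names date, time, logger, level and body_json are hypothetical and chosen only for illustration:

```
filter {
  dissect {
    mapping => {
      # Split on the literal delimiters between the %{...} fields;
      # the last field receives the rest of the line (the JSON body)
      "message" => "%{date} %{time} : %{logger} %{level}: Body: %{body_json}"
    }
  }
  # Parse the extracted body into event fields
  json {
    source => "body_json"
  }
}
```

dissect splits on fixed delimiters instead of running regular expressions, so it suits a prefix like this one whose layout does not vary from line to line.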