json, elasticsearch, opensearch, grok

How to filter only JSON from TEXT and JSON mixed format in logstash


We receive input from one of our applications in a mixed TEXT + JSON format, like this:

<12>1 2022-10-18T10:48:40.163Z 7VLX5D8 ERAServer 14016 - - {"event_type":"FilteredWebsites_Event","ipv4":"192.168.0.1","hostname":"9krkvs1","source_uuid":"11160173-r3bc-46cd-9f4e-99f66fc0a4eb","occured":"18-Oct-2022 10:48:37","severity":"Warning","event":"An attempt to connect to URL","target_address":"172.66.43.217","target_address_type":"IPv4","scanner_id":"HTTP filter","action_taken":"Blocked","handled":true,"object_uri":"https://free4pc.org","hash":"0E9ACB02118FBF52B28C3570D47D82AFB82EB58C","username":"CKFCVS1\\some.name","processname":"C:\\Users\\some.name\\AppData\\Local\\Programs\\Opera\\opera.exe","rule_id":"Blocked by internal blacklist"}

That is, the prefix <12>1 2022-10-18T10:48:40.163Z 7VLX5D8 ERAServer 14016 - - is plain TEXT and the rest is JSON.

The TEXT part always looks similar; only the date and time differ, so it is fine to drop it entirely.
The JSON part varies from event to event, but it contains the useful information.

Currently, in Kibana, the logs appear in the message field, but the individual fields are not extracted because the line as a whole is not valid JSON. When we manually put ONLY the JSON part into the file, we get the required output in Kibana.

So our question is: how can we achieve this with Logstash filters/grok?

Update:
@Val - We already have the configuration below:

input {
  syslog {
    port => 5044
    codec => json
  }
}

But the output in Kibana appears as: [screenshot]

And we want it like: [screenshot]


Solution

  • Even though syslog seems like an appealing way of shipping data, it is a big mess in terms of standardization, and everyone has a different way of shipping data. The Logstash syslog input only supports RFC3164, and your log format (which looks closer to RFC5424) doesn't match that standard.

    You can still bypass the normal RFC3164 parsing by providing your own grok pattern, as shown below:

    input {
      syslog {
        port => 5044
        grok_pattern => "<%{POSINT:priority_key}>%{POSINT:version} %{TIMESTAMP_ISO8601:timestamp} %{HOSTNAME:[observer][hostname]} %{WORD:[observer][name]} %{WORD:[process][id]} - - %{GREEDYDATA:[event][original]}"
      }
    }
    filter {
      json {
        source => "[event][original]"
      }
    }
    output {
      stdout { codec => json }
    }
    

    Running Logstash with the above config, your sample log line is parsed as follows:

    {
        "@timestamp": "2022-10-18T10:48:40.163Z",
        "@version": "1",
        "action_taken": "Blocked",
        "event": "An attempt to connect to URL",
        "event_type": "FilteredWebsites_Event",
        "facility": 0,
        "facility_label": "kernel",
        "handled": true,
        "hash": "0E9ACB02118FBF52B28C3570D47D82AFB82EB58C",
        "host": "0:0:0:0:0:0:0:1",
        "hostname": "9krkvs1",
        "ipv4": "192.168.0.1",
        "message": "<12>1 2022-10-18T10:48:40.163Z 7VLX5D8 ERAServer 14016 - - {\"event_type\":\"FilteredWebsites_Event\",\"ipv4\":\"192.168.0.1\",\"hostname\":\"9krkvs1\",\"source_uuid\":\"11160173-r3bc-46cd-9f4e-99f66fc0a4eb\",\"occured\":\"18-Oct-2022 10:48:37\",\"severity\":\"Warning\",\"event\":\"An attempt to connect to URL\",\"target_address\":\"172.66.43.217\",\"target_address_type\":\"IPv4\",\"scanner_id\":\"HTTP filter\",\"action_taken\":\"Blocked\",\"handled\":true,\"object_uri\":\"https://free4pc.org\",\"hash\":\"0E9ACB02118FBF52B28C3570D47D82AFB82EB58C\",\"username\":\"CKFCVS1\\\\some.name\",\"processname\":\"C:\\\\Users\\\\some.name\\\\AppData\\\\Local\\\\Programs\\\\Opera\\\\opera.exe\",\"rule_id\":\"Blocked by internal blacklist\"}\n",
        "object_uri": "https://free4pc.org",
        "observer": {
            "hostname": "7VLX5D8",
            "name": "ERAServer"
        },
        "occured": "18-Oct-2022 10:48:37",
        "priority": 0,
        "priority_key": "12",
        "process": {
            "id": "14016"
        },
        "processname": "C:\\Users\\some.name\\AppData\\Local\\Programs\\Opera\\opera.exe",
        "rule_id": "Blocked by internal blacklist",
        "scanner_id": "HTTP filter",
        "severity": "Warning",
        "severity_label": "Emergency",
        "source_uuid": "11160173-r3bc-46cd-9f4e-99f66fc0a4eb",
        "target_address": "172.66.43.217",
        "target_address_type": "IPv4",
        "timestamp": "2022-10-18T10:48:40.163Z",
        "username": "CKFCVS1\\some.name",
        "version": "1"
    }
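
    Since the question says the TEXT part can be discarded, the filter stage could also drop the raw line and the header fields once the JSON has been parsed. The following is a minimal, untested sketch that assumes the field names from the grok pattern above and that nothing downstream needs the original line; adjust the list to whatever you actually want to keep:

    filter {
      json {
        source => "[event][original]"
        # remove_field is a common filter option that is only applied when
        # the filter succeeds, so lines whose JSON fails to parse keep their
        # raw message for debugging.
        # Assumption: none of these header fields are needed in Kibana.
        remove_field => ["message", "priority_key", "version", "timestamp"]
      }
    }

    Note that in the output above the JSON payload's own "event" key has replaced the [event][original] field, so that field does not need to be removed separately. If you would rather keep the parsed keys away from the top level, the json filter's target option can place them under a field of your choosing.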