How do you find a quoted string containing a specific word in a log message using a grok pattern?


I have a log message from my server with the format below:

{"host":"a.b.com","source_type":"ABCD"}

I have this grok pattern so far, but it matches any word in double quotes:

\A%{QUOTEDSTRING}:%{PROG}

How can I change QUOTEDSTRING so that it matches only "host"? "host" is not always at the beginning of the message; it can also appear in the middle.

Thanks for your help.


Solution

  • Since the question specifies that "host" can appear anywhere in the log message, you can use the following pattern:

    \{(\"%{GREEDYDATA:data_before}\",)?(\"host\":\"%{DATA:host_value}\")?(,\"%{GREEDYDATA:data_after}\")?\}
    

    Explanation :

    1. data_before : stores the optional data that precedes the host entry. You can split it further as needed.
    2. host_value : stores the host value.
    3. data_after : stores the optional data that follows the host entry. You can split it further as needed.
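
To check how the three captures behave outside Logstash, the pattern can be approximated as a plain regular expression. This is only a sketch: it assumes GREEDYDATA expands to `.*` and DATA to `.*?`, as in the stock grok pattern definitions, and the names `HOST_ANYWHERE` and `extract` are made up for illustration. Python's `re` engine is not the one grok uses, so treat the result as a sanity check rather than a guarantee.

```python
import re

# Assumed expansions of the stock grok patterns:
#   GREEDYDATA -> .*    DATA -> .*?
HOST_ANYWHERE = re.compile(
    r'\{("(?P<data_before>.*)",)?'        # optional entries before "host"
    r'("host":"(?P<host_value>.*?)")?'    # optional "host" entry
    r'(,"(?P<data_after>.*)")?\}'         # optional entries after "host"
)

def extract(line):
    """Return the three named captures, or None when the line does not match."""
    match = HOST_ANYWHERE.search(line)
    return match.groupdict() if match else None

if __name__ == "__main__":
    samples = (
        '{"host":"a.b.com","source_type":"ABCD"}',
        '{"host":"a.b.com"}',
        '{"source_type":"ABCD","host":"a.b.com","data_type":"ABCD"}',
    )
    for sample in samples:
        print(extract(sample))
```

Note that each capture excludes the outermost double quotes of its span, because those quotes are matched literally by the pattern around the capture.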

    Examples :

    1. {"host":"a.b.com","source_type":"ABCD"}

    Output :

    {
      "data_before": [
        [
          null
        ]
      ],
      "host_value": [
        [
          "a.b.com"
        ]
      ],
      "data_after": [
        [
          "source_type":"ABCD"
        ]
      ]
    }
    
    2. {"host":"a.b.com"}

    Output :

    {
      "data_before": [
        [
          null
        ]
      ],
      "host_value": [
        [
          "a.b.com"
        ]
      ],
      "data_after": [
        [
          null
        ]
      ]
    }
    
    3. {"source_type":"ABCD","host":"a.b.com","data_type":"ABCD"}

    Output :

    {
      "data_before": [
        [
          "source_type":"ABCD"
        ]
      ],
      "host_value": [
        [
          "a.b.com"
        ]
      ],
      "data_after": [
        [
          "data_type":"ABCD"
        ]
      ]
    }
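
If only the host value is needed, a narrower alternative (not part of the answer above) is to match the literal "host" key and nothing else, relying on the search being unanchored so the key can sit anywhere in the message. In grok terms this would presumably look like \"host\":\"%{DATA:host_value}\" with the leading \A dropped. The sketch below shows the same idea as a plain regular expression; `HOST_ONLY` and `find_host` are hypothetical names, and `[^"]*` is used in place of DATA on the assumption that host values never contain a double quote.

```python
import re

# Hypothetical narrower pattern: match only the "host" key, wherever it appears.
# Assumes the host value itself never contains a double quote.
HOST_ONLY = re.compile(r'"host":"(?P<host_value>[^"]*)"')

def find_host(line):
    """Return the host value, or None when the line has no "host" entry."""
    match = HOST_ONLY.search(line)  # search() is unanchored, unlike match()
    return match.group("host_value") if match else None

if __name__ == "__main__":
    print(find_host('{"source_type":"ABCD","host":"a.b.com","data_type":"ABCD"}'))
    print(find_host('{"source_type":"ABCD"}'))
```

The trade-off is that the surrounding entries are not captured at all, which may be fine when host is the only field of interest.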
    

    Tip : Use the following resources to tune and test your logging patterns :