Grok extracting data from matched pattern


I have this message as input:

Feb 18 04:35:46 xxxx zzzz-nginx_error 2016/02/18 04:35:39 [error] 28585#0: *3120 FastCGI sent in stderr: "Primary script unknown" while reading response header from upstream, client: xx.xx.xx.xx, server: xxxxxx, request: "HEAD / HTTP/1.1", upstream: "fastcgi://unix:/var/run/default.sock:", host: "xxxxxx"

And I am parsing it with:

  grok {
    match => {
      "message" => [
        "(?<logstamp>\h{3} \d{2} \d{2}:\d{2}:\d{2}) (?<hostname>[^\s]+) (?<source>[^\s]+) (?<ngxstamp>[^\s]+ [^\s]+) %{GREEDYDATA:log}"
      ]
    }
  }

Which is fine, but I also want to extract client: xx.xx.xx.xx while keeping it inside %{GREEDYDATA:log}.

I've tried:

"(?<logstamp>\h{3} \d{2} \d{2}:\d{2}:\d{2}) (?<hostname>[^\s]+) (?<source>[^\s]+) (?<ngxstamp>[^\s]+ [^\s]+) %{DATA:log} (?<client>%{IP})%{GREEDYDATA:log}"

but this just breaks the output as:

log: [error] 28585#0: *3120 FastCGI sent in stderr: "Primary script unknown" while reading response header from upstream, client:, , server: xxxxxx, request: "HEAD / HTTP/1.1", upstream: "fastcgi://unix:/var/run/default.sock:", host: "xxxxxx"
client: xx.xx.xx.xx

(notice that the IP is now missing from log)

Can I just extract the data I need or should I join them with something like:

  mutate {
    replace => {
      "log" => "%{DATA:log} (?<client>%{IP})%{GREEDYDATA:log}"
    }
  }

?


Solution

  • I just realized the answer was staring me in the face: capture the text after the IP into a separate field (log2) instead of reusing log. This is the pattern:

    "(?<logstamp>\h{3} \d{2} \d{2}:\d{2}:\d{2}) (?<hostname>[^\s]+) (?<source>[^\s]+) (?<ngxstamp>[^\s]+ [^\s]+) %{DATA:log} (?<client>%{IP})%{GREEDYDATA:log2}"
    

    And this is the join that rebuilds log from the three pieces:

      mutate {
        replace => {
          "log" => "%{log} %{client}%{log2}"
        }
      }
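
    A possible alternative that avoids the join, as an untested sketch: keep the original pattern ending in %{GREEDYDATA:log}, then run a second grok filter against the log field to copy the IP out. Grok patterns are not anchored unless you add ^ or $, so the pattern below can match in the middle of the field. It assumes the text always contains "client: " followed by an IP; events without it would get the _grokparsefailure tag by default.

      grok {
        # Second pass: read the already-extracted log field and copy the
        # client IP into its own field. The log field itself is left unchanged.
        match => { "log" => "client: %{IP:client}" }
      }

    The first attempt most likely lost the IP because %{DATA:log} and %{GREEDYDATA:log} both write to log, so the field ends up holding the two fragments around the IP rather than the whole text.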