Tags: regex, elasticsearch, logstash, grok, logz.io

Using Grok to filter a UUID out of a path string and return the remaining path


Here is an example line from the log I am trying to filter:

Request starting HTTP/1.1 GET http://api0.api.sin/api/social/v1/owner/4b3b60f6-1a54-4fbc-87b5-cc44496a6dbf/feeds/notifications/unread/count

The result I am expecting is the following:

{
  "message": [
    [
      "Request starting"
    ]
  ],
  "httpversion": [
    [
      "1.1"
    ]
  ],
  "BASE10NUM": [
    [
      "1.1"
    ]
  ],
  "verb": [
    [
      "GET"
    ]
  ],
  "request": [
    [
      "http://api0.api.sin/api/social/v1/owner/feeds/notifications/unread/count"
    ]
  ],
  "uuid": [
    [
      "4b3b60f6-1a54-4fbc-87b5-cc44496a6dbf"
    ]
  ]
}

I've tried the following grok expression, but request is returned as two separate values:

%{DATA:message}(?: HTTP/%{NUMBER:httpversion}) %{WORD:verb} %{NOTSPACE:request}%{UUID:uuid}%{NOTSPACE:request}

Solution

  • You can capture the parts before and after the UUID into separate fields, then combine the two values into one field with mutate:

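    # Capture the path before the UUID as request1 and the path after it as request2.
    grok {
      match => {
        "message" => "%{DATA:message}(?: HTTP/%{NUMBER:httpversion}) %{WORD:verb} %{NOTSPACE:request1}/%{UUID:uuid}%{NOTSPACE:request2}"
      }
    }
    
    # Join the two halves into a single request field without the UUID.
    mutate {
      add_field => {
        "request" => "%{request1}%{request2}"
      }
    }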
    

    You can drop request1 and request2 afterwards if you no longer need them.

    If you can't use mutate, the best you can do is an expression in which request still includes the UUID:

    %{DATA:message}(?: HTTP/%{NUMBER:httpversion}) %{WORD:verb} (?<request>.*?(?<uuid>[a-fA-F0-9]{8}(?:-[a-fA-F0-9]{4}){3}-[a-fA-F0-9]{12})\S*)
    

    because a single capturing group cannot match two disjoint pieces of text.
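
    As for dropping the temporary fields in the mutate approach, a minimal sketch is a second mutate placed after the one that builds request. It assumes the field names request1 and request2 from the grok pattern in this answer; adjust them if you named the captures differently.

    ```
    # Runs after the mutate that builds "request", so both halves
    # still exist at the point where they are joined.
    mutate {
      remove_field => ["request1", "request2"]
    }
    ```

    Keeping the removal in its own mutate block makes the ordering explicit: request is built first, and only then are the two halves discarded.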