
Extract Parameter (sub-string) from URL GROK Pattern


I have ELK running for log analysis and everything is working; there are just a few tweaks I would like to make. To all the ES/ELK gods on Stack Overflow, I'd appreciate any help on this. I'd gladly buy you a cup of coffee! :D

Example:

URL: /origina-www.domain.com/this/is/a/path?page=2

First I would like to get the entire path as seen above.

Second, I would like to get just the path before the parameter: /origina-www.domain.com/this/is/a/path

Third, I would like to get just the parameter: ?page=2

Fourth, I would like the timestamp from the log file to be the main timestamp in Kibana. Currently, the timestamp Kibana shows is the date and time the event was processed by ES.

This is what a sample entry looks like:

2016-10-19 23:57:32 192.168.0.1 GET /origin-www.example.com/url 200 1144 0 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" "-" "-"

Here's my config:

if [type] == "syslog" {
    grok {
      match => ["message", "%{IP:client}\s+%{WORD:method}\s+%{URIPATHPARAM:request}\s+%{NUMBER:bytes}\s+%{NUMBER:duration}\s+%{USER-AGENT}\s+%{QS:referrer}\s+%{QS:agent}%{GREEDYDATA}"]
    }
    date {
      match => [ "timestamp", "MMM dd, yyyy HH:mm:ss a" ]
      locale => "en"
    }   
}

ES Version: 5.0.1 Logstash Version: 5.0 Kibana: 5.0

UPDATE: I was actually able to solve it by using:

grok {
    match => ["message", "%{IP:client}\s+%{WORD:method}\s+%{URIPATHPARAM:request}\s+%{NUMBER:bytes}\s+%{NUMBER:duration}\s+%{USER-AGENT}\s+%{QS:referrer}\s+%{QS:agent}%{GREEDYDATA}"]
}
grok {
    match => [ "request", "%{GREEDYDATA:uri_path}\?%{GREEDYDATA:uri_query}" ]
}

kv {
    source => "uri_query"
    field_split => "&"
    target => "query"
}
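
One caveat with the second grok above: it requires a literal `?`, so a request without a query string (such as the sample entry, `/origin-www.example.com/url`) would not match and would be tagged `_grokparsefailure`. A minimal sketch of a variant that makes the query part optional, assuming the same `request` field and the stock `URIPATH` and `URIPARAM` grok patterns; the field name `uri_param` is a hypothetical choice, and the fragment is meant to sit inside the existing filter section:

```
grok {
    # URIPARAM includes the leading "?", so uri_param would hold e.g. "?page=2"
    match => { "request" => "^%{URIPATH:uri_path}(?:%{URIPARAM:uri_param})?" }
}

# only run kv when a query string was actually captured
if [uri_param] {
    kv {
        source => "uri_param"
        # split on both "&" and "?" so the leading "?" does not end up in a key
        field_split => "&?"
        target => "query"
    }
}
```

With this variant, `request` keeps the entire path, `uri_path` holds the part before the parameters, and `uri_param` holds the parameters including the `?`.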

Solution

  • To use the actual timestamp of your log entry rather than the time it was indexed, you could use the date and mutate plugins to override the existing timestamp value. Your Logstash filter could look something like this:

            # filter your log file
            grok {
                    # you could keep a pattern file containing an expression such as
                    #   LOGTIMESTAMP %{YEAR}%{MONTHNUM}%{MONTHDAY} %{TIME}
                    # if you have to change the timestamp format
                    patterns_dir => ["/pathto/patterns"]
                    match => { "message" => "^%{LOGTIMESTAMP:logtimestamp}%{GREEDYDATA}" }
            }
            # override the existing timestamp with the new field logtimestamp
            mutate {
                    add_field => { "timestamp" => "%{logtimestamp}" }
                    remove_field => ["logtimestamp"]
            }
            # parse the timestamp as a date, interpreting it as UTC
            date   {
                    match => [ "timestamp" , "ISO8601" , "yyyyMMdd HH:mm:ss.SSS" ]
                    # note: without a target, the date filter writes to @timestamp by default
                    target => "timestamp"
                    locale => "en"
                    timezone => "UTC"
            }
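
    For the sample entry in the question, whose timestamp looks like `2016-10-19 23:57:32`, the same idea might be sketched as follows. This assumes the stock `TIMESTAMP_ISO8601` grok pattern (so no custom patterns file is needed) and that the log times are in UTC; `rest` is a hypothetical field name for the remainder of the line. Because no `target` is set, the date filter writes the parsed value to `@timestamp`, which is the field Kibana normally uses as the event time:

```
grok {
    # TIMESTAMP_ISO8601 accepts "2016-10-19 23:57:32"
    match => { "message" => "^%{TIMESTAMP_ISO8601:logtimestamp}\s+%{GREEDYDATA:rest}" }
}
date {
    # no target given, so the parsed value replaces @timestamp
    match => [ "logtimestamp", "yyyy-MM-dd HH:mm:ss" ]
    # assumption: the log times are recorded in UTC
    timezone => "UTC"
    # only removed when the date was parsed successfully
    remove_field => [ "logtimestamp" ]
}
```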
    

    You could also take a look at this Question for more. Hope it helps.