regex logstash-grok

Grok: how to match until first space (IIS Log)


I want to parse IIS logs. One possible input looks like this:

2017-10-01 00:00:01 W3SVC2 xx xx.xx.xx.xx GET /CMSPages/PortalTemplate.aspx searchtext=excel-template-aa-xx-xx&xx=%2xx%2xx.com%yyyyy 443 - yy.yy.yy.yy HTTP/1.1 Mozilla/5.0+(compatible;+bingbot/2.0;++http://www.bing.com/bingbot.htm) - - www.yyyyy.com 410 0 64 0 335 32791

I can parse the input above up to searchtext, but I don't know how to capture the search text itself.

%{TIMESTAMP_ISO8601:log_timestamp}%{SPACE}%{WORD:machine}%{SPACE}%{WORD:ServerName}%{SPACE}%{IPV4:serverIP}%{SPACE}%{WORD:method}%{SPACE}%{URIPATH:uriStem}%{SPACE}%{WORD:searchTextWord}

Is there a way to check whether searchtext exists and, if so, capture the text that follows it (up to the first space) as the search text?


Solution

  • Code


    \bsearchtext=\S+
    

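    For Grok, wrap it in an optional group and capture the value with a named pattern (the field name searchText is only an example):

    (?:searchtext=%{NOTSPACE:searchText})?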
    

    Against the sample line, the regex matches:

    searchtext=excel-template-aa-xx-xx&xx=%2xx%2xx.com%yyyyy
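
The plain regex can be checked outside Logstash. The following is a minimal Python sketch, not part of the original answer; the names LOG_LINE, SEARCHTEXT_RE and find_searchtext are illustrative, and Python's re module stands in for the Oniguruma engine that Grok uses.

```python
import re

# Sample IIS log line copied from the question.
LOG_LINE = (
    "2017-10-01 00:00:01 W3SVC2 xx xx.xx.xx.xx GET /CMSPages/PortalTemplate.aspx "
    "searchtext=excel-template-aa-xx-xx&xx=%2xx%2xx.com%yyyyy 443 - yy.yy.yy.yy HTTP/1.1 "
    "Mozilla/5.0+(compatible;+bingbot/2.0;++http://www.bing.com/bingbot.htm) "
    "- - www.yyyyy.com 410 0 64 0 335 32791"
)

# Word boundary, the literal "searchtext=", then everything up to the first whitespace.
SEARCHTEXT_RE = re.compile(r"\bsearchtext=\S+")


def find_searchtext(line):
    """Return the 'searchtext=...' token, or None when the line has no such token."""
    match = SEARCHTEXT_RE.search(line)
    return match.group(0) if match else None


print(find_searchtext(LOG_LINE))
```

Because \S+ stops at the first whitespace character, the match ends right before the port field (443).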


    Explanation

    • \b Assert position as a word boundary
    • searchtext= Match this literally
    • \S+ Match any non-whitespace character one or more times
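
To see how making the searchtext part optional behaves, here is a rough Python analogue. It is a sketch under assumptions rather than Grok itself: the simplified pattern covers only the method, URI stem and query field, and the group names method, uriStem and searchText are hypothetical stand-ins for Grok field names.

```python
import re

# Simplified stand-in for the tail of the Grok pattern:
# HTTP method, URI stem, then an optional "searchtext=<value>" query field.
REQUEST_RE = re.compile(
    r"(?P<method>[A-Z]+)\s+"
    r"(?P<uriStem>/\S*)\s+"
    r"(?:searchtext=(?P<searchText>\S+))?"
)


def parse_request(fragment):
    """Return the captured fields as a dict, or None if the fragment does not match."""
    match = REQUEST_RE.search(fragment)
    return match.groupdict() if match else None


with_query = parse_request("GET /CMSPages/PortalTemplate.aspx searchtext=excel-template 443")
without_query = parse_request("GET /default.aspx - 443")

print(with_query["searchText"])
print(without_query["searchText"])
```

When the query field does not start with searchtext=, the optional group matches nothing and the capture is simply absent (None in Python). The rest of the pattern still matches, which is the point of wrapping that part in (?: ... )?.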