logstash, elastic-stack, logstash-grok

ELK Grok pattern - variable number of parameters for nginx error log


I'm trying to set up a grok pattern to parse nginx error logs, but the line format varies. For example, when there is no referrer, nginx simply omits that field from the end of the line:

2018/08/30 09:30:32 [error] 84843#0: *24414687217 open() "/www/sites/js/draw.js" failed (2: No such file or directory), client: 172.68.211.134, server: www.example.com, request: "GET /bundles/app/js/draw.js HTTP/1.1", host: "www.example.com"

But if there is one, it adds:

, referrer: "https://www.example.com/de/member/foo"

My current grok pattern works for lines with a referrer, but how can I make it handle both cases?

%{DATA:nginx_error.time} \[%{DATA:nginx_error.level}\] %{NUMBER:nginx_error.pid}#%{NUMBER:nginx_error.tid}: (\*%{NUMBER:nginx_error.connection_id} )?%{GREEDYDATA:nginx_error.message}, client: %{IP:nginx_error.client}, server: %{HOSTNAME:nginx_error.server}, request: \"%{DATA:nginx_error.request}\", host: \"%{HOSTNAME:nginx_error.host}\", referrer: \"%{URI:nginx_error.referrer}\"

Solution

  • You can make the referrer optional by wrapping the whole clause, including the leading comma, in a group and following it with ?, like this: (, referrer: \"%{URI:nginx_error.referrer}\")?

    Anything enclosed in parentheses (...) is a group, and a trailing ? means the group may match zero or one time. Keeping the label and the quoted URL in a single group means a line cannot match with only one of them present.

    Your pattern will then become,

    %{DATA:nginx_error.time} \[%{DATA:nginx_error.level}\] %{NUMBER:nginx_error.pid}#%{NUMBER:nginx_error.tid}: (\*%{NUMBER:nginx_error.connection_id} )?%{GREEDYDATA:nginx_error.message}, client: %{IP:nginx_error.client}, server: %{HOSTNAME:nginx_error.server}, request: \"%{DATA:nginx_error.request}\", host: \"%{HOSTNAME:nginx_error.host}\"(, referrer: \"%{URI:nginx_error.referrer}\")?
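To check the optional-group idea outside Logstash, here is a minimal Python sketch that approximates the pattern with the standard `re` module. The character classes standing in for IP, HOSTNAME and URI are loose assumptions rather than the real grok definitions, and the names `NGINX_ERROR`, `BASE`, `WITH_REFERRER` and `parse` are made up for illustration. The trailing `$` is an extra anchor that is not part of the grok pattern above.

```python
import re

# Simplified stand-ins for the grok patterns (assumptions, not the real
# definitions): DATA -> .*?, GREEDYDATA -> .*, IP/HOSTNAME/URI -> loose classes.
NGINX_ERROR = re.compile(
    r'(?P<time>.*?) \[(?P<level>.*?)\] (?P<pid>\d+)#(?P<tid>\d+): '
    r'(?:\*(?P<connection_id>\d+) )?(?P<message>.*)'
    r', client: (?P<client>[0-9a-fA-F.:]+)'
    r', server: (?P<server>[\w.-]+)'
    r', request: "(?P<request>.*?)"'
    r', host: "(?P<host>[\w.-]+)"'
    # The whole referrer clause, comma included, is one optional group.
    r'(?:, referrer: "(?P<referrer>[^"]+)")?$'
)

# The two sample lines from the question.
BASE = ('2018/08/30 09:30:32 [error] 84843#0: *24414687217 open() '
        '"/www/sites/js/draw.js" failed (2: No such file or directory), '
        'client: 172.68.211.134, server: www.example.com, '
        'request: "GET /bundles/app/js/draw.js HTTP/1.1", '
        'host: "www.example.com"')
WITH_REFERRER = BASE + ', referrer: "https://www.example.com/de/member/foo"'


def parse(line):
    """Return the named fields as a dict, or None if the line does not match."""
    m = NGINX_ERROR.match(line)
    return m.groupdict() if m else None


if __name__ == "__main__":
    for line in (BASE, WITH_REFERRER):
        print(parse(line)["referrer"])
```

Both sample lines match: the `referrer` field is `None` for the line without a referrer and holds the URL for the line with one. The other fields are captured the same way in both cases.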