Following log:
Jul 25 07:45:12 tst-proxy202 haproxy[1104]: 10.64.111.222:36635 [25/Jul/2016:07:45:12.479] promocloud~ promocloud/tst-service-proxy203 32/0/1/27/60 200 664 - - ---- 0/0/0/0/0 0/0 {} {} "POST /RTI HTTP/1.1"
Is parsed with ${HAPROXYHTTP} grok pattern
%{SYSLOGTIMESTAMP:syslog_timestamp} %{IPORHOST:syslog_server} %{SYSLOGPROG}: %{IP:client_ip}:%{INT:client_port} \[%{HAPROXYDATE:accept_date}\] %{NOTSPACE:frontend_name} %{NOTSPACE:backend_name}/%{NOTSPACE:server_name} %{INT:time_request}/%{INT:time_queue}/%{INT:time_backend_connect}/%{INT:time_backend_response}/%{NOTSPACE:time_duration} %{INT:http_status_code} %{NOTSPACE:bytes_read} %{DATA:captured_request_cookie} %{DATA:captured_response_cookie} %{NOTSPACE:termination_state} %{INT:actconn}/%{INT:feconn}/%{INT:beconn}/%{INT:srvconn}/%{NOTSPACE:retries} %{INT:srv_queue}/%{INT:backend_queue} (\{%{HAPROXYCAPTUREDREQUESTHEADERS}\})?( )?(\{%{HAPROXYCAPTUREDRESPONSEHEADERS}\})?( )?"(<BADREQ>|(%{WORD:http_verb} (%{URIPROTO:http_proto}://)?(?:%{USER:http_user}(?::[^@]*)?@)?(?:%{URIHOST:http_host})?(?:%{URIPATHPARAM:http_request})?( HTTP/%{NUMBER:http_version})?))?"
This works well, up to some unexpected null in the syslog_server in a HOSTNAME section
"syslog_server": [
[
"tst-proxy202"
]
],
"HOSTNAME": [
[
"tst-proxy202",
null <<<<<<<<<
]
],
"IP": [
[
null,
null
]
],
"IPV6": [
[
null,
null,
null
]
],
"IPV4": [
[
null,
"10.64.111.222",
null
]
],
I did parse this with https://grokdebug.herokuapp.com/ and the patterns IPORHOST, and the IPORHOST https://grokdebug.herokuapp.com/patterns# works well against the hostname
tst-proxy202
%{IPORHOST:syslog_server}
{
"syslog_server": [
[
"tst-proxy202"
]
],
"HOSTNAME": [
[
"tst-proxy202"
]
],
"IP": [
[
null
]
],
"IPV6": [
[
null
]
],
"IPV4": [
[
null
]
]
}
Any idea what might be the problem?
If I understood you correctly you are trying to get rid of that null value. Well, the null value occurs because of the last part of the HAPROXYHTTP pattern (where it says ?(?:%{URIHOST:http_host})?(?:%{URIPATHPARAM:http_request})?( HTTP/%{NUMBER:http_version})?))?"
). It somehow adds an empty HOSTNAME. Luckily, this is not a serious problem and here is why:
The default options of the grok filter include named_captures_only => true
(docs) and keep_empty_captures => false
(docs). Try these two options in the grok debugger and your output looks pretty clean. In logstash you don't have to change anything.
If logstash misinterprets your hostname try to retrieve it from the grok values yourself (e.g. use the mutate filter):
filter {
mutate {
replace => { "HOSTNAME" => "%{syslog_server}" }
}
}
Please let me know if you have further problems.