I'm shipping Windows DNS debug logs as JSON into Elasticsearch and I need to parse them. As usual with Microsoft, nothing is easy: the DNS debug log is not a CSV. The only helpful thing about the file is that its columns have fixed widths.
Here is a sample of the DNS logs:
11/21/2014 5:59:13 PM 0458 PACKET 00000000039ED750 UDP Rcv 192.168.1.98 600c Q [0001 D NOERROR] A (9)grokdebug(9)herokuapp(3)com(0)
11/21/2014 5:59:13 PM 0458 PACKET 00000000039EF460 UDP Snd 192.168.1.1 e044 Q [0001 D NOERROR] A (9)grokdebug(9)herokuapp(3)com(0)
11/21/2014 5:59:13 PM 0458 PACKET 00000000039F85B0 UDP Rcv 192.168.1.1 e044 R Q [8081 DR NOERROR] A (9)grokdebug(9)herokuapp(3)com(0)
11/21/2014 5:59:13 PM 0458 PACKET 00000000039F85B0 UDP Snd 192.168.1.98 600c R Q [8081 DR NOERROR] A (9)grokdebug(9)herokuapp(3)com(0)
I looked at this Stack Overflow answer, "Logstash grok filter help - fixed position file", and tried to set up a grok filter to parse the columns, but it's not working for me. I understand I have a syntax issue, but I can't seem to find a good example that would steer me in the right direction.
Here is my grok filter:
grok {
match => [ "message", "(?<dns_date_n_time>.{21}) (?<dns_field_1>.{5}) (?dns_type>.{8}) (?<dns_field_2>.{19}) (?<dns_protocol>.{4}) (?<dns_direction>.{4}) (?<dns_ip>.{16}) (?<dns_field_3>.{4}) (?<dns_query_type>.{5}) (?<dns_field_5>.{7}) (?<dns_field_6>.{3}) (?<dns_flag>.{9}) (?<dns_field_7>.{2}) (?<dns_record>.{5}) (?<dns_domain>.{255})" ]
}
Can anyone help?
Don't get hung up on the fact that the logfile happens to have a fixed-width format. It doesn't really help here. Parse the file like any old logfile, using the relevant grok patterns. This works for the input you provided:
(?<timestamp>%{DATE_US} %{TIME} (?:AM|PM))\s+%{NUMBER}\s+%{WORD:dns_type}\s+
%{BASE16NUM}\s+%{WORD:dns_protocol}\s+%{WORD:dns_direction}\s+%{IP:dns_ip}\s+
%{BASE16NUM}\s+%{WORD:dns_query_type}\s+\[%{BASE16NUM}\s+%{WORD}\s+
%{WORD:dns_result}\]\s+%{WORD:dns_record}\s+%{GREEDYDATA:dns_domain}
That said, since I don't know what each column in the logfile means, some of the patterns used here might be too sloppy or too strict. I've inserted line breaks to make the answer more readable, so make sure you concatenate things correctly when you insert the expression into your configuration file.
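As a rough sketch of the concatenated form, the filter might look something like the block below, using the same `match => [ "message", ... ]` layout as in the question. One thing differs from the pattern as written above, and it is an assumption on my part: the response lines in the sample carry an extra `R` before the `Q`, which a single `%{WORD}` followed directly by `\[` does not cover, so the sketch adds an optional group for it. The field name `dns_response` is made up for illustration.

```
filter {
  grok {
    # The whole expression must be a single line with no breaks.
    # (?:%{WORD:dns_response}\s+)? is an assumed addition so that lines with
    # "R Q" (responses) match as well as lines with just "Q" (queries).
    match => [ "message", "(?<timestamp>%{DATE_US} %{TIME} (?:AM|PM))\s+%{NUMBER}\s+%{WORD:dns_type}\s+%{BASE16NUM}\s+%{WORD:dns_protocol}\s+%{WORD:dns_direction}\s+%{IP:dns_ip}\s+%{BASE16NUM}\s+(?:%{WORD:dns_response}\s+)?%{WORD:dns_query_type}\s+\[%{BASE16NUM}\s+%{WORD}\s+%{WORD:dns_result}\]\s+%{WORD:dns_record}\s+%{GREEDYDATA:dns_domain}" ]
  }
}
```

Events that don't match are tagged `_grokparsefailure`, so it is worth checking for that tag against a real log before relying on the extracted fields.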