regex, logstash-grok, grok, graylog

Separate IPV4 and IPV6 addresses with Regular Expressions and Grok


I'm trying to build a Grok pattern for some log files coming in. I have a field in a log message that can look like both of the following:

IP Address: (192.168.1.100),
IP Address: (192.168.1.100, 2001:0db8:85a3:0000:0000:8a2e:0370:7334),

Dealing with the first example is pretty straightforward: I defined a new IP pattern called IP2 that escapes the parentheses, \((?:%{IP})\), and ended up with something like:

Example Core RegEx Patterns:
IPV6 ((([0-9A-Fa-f]{1,4}:){7}([0-9A-Fa-f]{1,4}|:))|(([0-9A-Fa-f]{1,4}:){6}(:[0-9A-Fa-f]{1,4}|((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){5}(((:[0-9A-Fa-f]{1,4}){1,2})|:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3})|:))|(([0-9A-Fa-f]{1,4}:){4}(((:[0-9A-Fa-f]{1,4}){1,3})|((:[0-9A-Fa-f]{1,4})?:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){3}(((:[0-9A-Fa-f]{1,4}){1,4})|((:[0-9A-Fa-f]{1,4}){0,2}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){2}(((:[0-9A-Fa-f]{1,4}){1,5})|((:[0-9A-Fa-f]{1,4}){0,3}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(([0-9A-Fa-f]{1,4}:){1}(((:[0-9A-Fa-f]{1,4}){1,6})|((:[0-9A-Fa-f]{1,4}){0,4}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:))|(:(((:[0-9A-Fa-f]{1,4}){1,7})|((:[0-9A-Fa-f]{1,4}){0,5}:((25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)(\.(25[0-5]|2[0-4]\d|1\d\d|[1-9]?\d)){3}))|:)))(%.+)?
IPV4 (?<![0-9])(?:(?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5])[.](?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5])[.](?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5])[.](?:[0-1]?[0-9]{1,2}|2[0-4][0-9]|25[0-5]))(?![0-9])
IP (?:%{IPV6}|%{IPV4})
IP2 \((?:%{IP})\)

Grok Pattern for Field:
IP Address: %{IP2:ipv4_address},
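
For illustration, the behaviour of this pattern on the two sample lines can be approximated in plain Python. This is only a rough sketch, not Grok itself: the IPV4 expression is a simplified stand-in for the real definition, IP2 is reduced to the IPv4 case, and the helper name match_single is hypothetical.

```python
import re

# Simplified stand-in for Grok's IPV4 pattern (assumption: no range checks).
IPV4 = r"\d{1,3}(?:\.\d{1,3}){3}"

# IP2 wraps the address in escaped literal parentheses, as in the question.
IP2 = rf"\({IPV4}\)"

# Rough equivalent of: IP Address: %{IP2:ipv4_address},
SINGLE = re.compile(rf"IP Address: (?P<ipv4_address>{IP2}),")

one = "IP Address: (192.168.1.100),"
two = "IP Address: (192.168.1.100, 2001:0db8:85a3:0000:0000:8a2e:0370:7334),"


def match_single(line):
    """Return the captured field, or None when the line does not match."""
    m = SINGLE.search(line)
    return m.group("ipv4_address") if m else None


print(match_single(one))  # the capture includes the surrounding parentheses
print(match_single(two))  # no match: a comma follows the IPv4 address, not ")"
```

Two things stand out in this sketch: the captured value keeps the parentheses because they are part of IP2, and the second line does not match at all because the pattern expects the closing parenthesis right after the first address.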

I'm trying to figure out how to write a regular expression and Grok pattern for the case where both the IPv4 and IPv6 addresses show up. I'd be OK with always generating the IPv6 field and just leaving it empty when no IPv6 address is present.


Solution

  • You need to use an optional group:

    \(%{IPV4:ipv4_address}(?:,\s*%{IPV6:ipv6_address})?\)
                          ^^^                        ^^ 
    

    Breakdown:

    • \( - a literal opening parenthesis
    • %{IPV4:ipv4_address} - the IPV4 pattern, captured as ipv4_address
    • (?: - start of a non-capturing group; the ? after its closing parenthesis makes the whole group optional
      • , - a comma
      • \s* - zero or more whitespace characters
      • %{IPV6:ipv6_address} - the IPV6 pattern, captured as ipv6_address
    • )? - end of the group (the ? is a quantifier that matches 1 or 0 occurrences of the quantified subpattern)
    • \) - a literal closing parenthesis
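
The optional group can be tried outside Grok with a minimal Python sketch. The assumptions here: IPV4 and IPV6 are deliberately simplified stand-ins for the much stricter Grok definitions, Python's (?P<name>...) named groups play the role of %{PATTERN:field}, and the helper name extract is hypothetical.

```python
import re

# Simplified stand-ins for Grok's IPV4 and IPV6 patterns (assumptions for
# illustration only; they accept some strings the real patterns reject).
IPV4 = r"\d{1,3}(?:\.\d{1,3}){3}"
IPV6 = r"[0-9A-Fa-f]{0,4}(?::[0-9A-Fa-f]{0,4}){2,7}"

# Rough equivalent of:
#   IP Address: \(%{IPV4:ipv4_address}(?:,\s*%{IPV6:ipv6_address})?\),
PATTERN = re.compile(
    rf"IP Address: \((?P<ipv4_address>{IPV4})"
    rf"(?:,\s*(?P<ipv6_address>{IPV6}))?\),"
)


def extract(line):
    """Return a dict of both fields, or None when the line does not match."""
    m = PATTERN.search(line)
    return m.groupdict() if m else None


print(extract("IP Address: (192.168.1.100),"))
print(extract(
    "IP Address: (192.168.1.100, 2001:0db8:85a3:0000:0000:8a2e:0370:7334),"
))
```

In Python the unmatched optional group shows up as None for ipv6_address. A Grok implementation may instead leave the field empty or omit it, depending on its settings, so that part is worth checking against the tool in use.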