I have a KQL query that extracts an IPv4 address from a string. I now need to extract an IPv6 address from a string in the same way.
IPv4 extraction query:
datatable (ipv4text:string)
[
'This is a random text that has IP address 128.0.0.20 that has to be extracted'
]
|extend pv4addr = extract("(([0-9]{1,3})\\.([0-9]{1,3})\\.([0-9]{1,3})\\.(([0-9]{1,3})))",1,ipv4text)
I tried the query below, but I am not sure whether it covers all the edge cases:
datatable (ipv6:string)
[
'IPv6 test 198.192.0.127 2345:5:2CA1:0000:0000:567:5673:256/127 in a random string'
]
|extend Ipv6Address = extract(@"(([0-9a-fA-F]{1,4}\:){7,7}[0-9a-fA-F]{1,4})|([0-9a-fA-F]{1,4}\:){1,7}\:",1,ipv6)
Can anyone provide a complete KQL query (or suggestions/hints) to extract an IPv6 address?
Thanks.
The regex patterns can be simplified.
Below are the "happy paths": if an address is there, it will be extracted.
Theoretically you might get false positives, although that is less likely with real-life data.
If needed, we can add some protection layers.
datatable (ipv4text:string)
[
'This is a random text that has IP address 128.0.0.20 that has to be extracted'
]
| project pv4addr = extract(@"(\d{1,3}\.){3}\d{1,3}", 0, ipv4text)
pv4addr |
---|
128.0.0.20 |
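One possible protection layer, as a minimal sketch: pass the extracted candidate to the built-in `parse_ipv4()` function, which returns null when the string is not a valid IPv4 address, and keep the candidate only when parsing succeeds. The second input row and the `candidate` column name are made up for illustration.

```kusto
datatable (ipv4text:string)
[
'This is a random text that has IP address 128.0.0.20 that has to be extracted'
,'Hypothetical row: a version-like token 999.1.2.3 is not a valid address'
]
// Step 1: take the first dotted-quad-looking token (same pattern as above)
| extend candidate = extract(@"(\d{1,3}\.){3}\d{1,3}", 0, ipv4text)
// Step 2: keep it only if parse_ipv4() accepts it (each octet must be 0-255)
| project pv4addr = iff(isnotnull(parse_ipv4(candidate)), candidate, "")
```

This still only looks at the first match in each string. If a row can hold several addresses, `extract_all()` with the same pattern may be a better starting point.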
IPv6 can become a mess (see https://en.wikipedia.org/wiki/IPv6_address#Representation).
I would match either a full IPv6 representation (8 groups of hex digits, separated by colons) or any expression built of hex digits, colons, and periods that contains 2 adjacent colons.
datatable (ipv6:string)
[
'IPv6 test 198.192.0.127 2345:5:2CA1:0000:0000:567:5673:256/127 in a random string'
,'IPv6 test 198.192.0.127 2345:5:2CA1::567:5673:256/127 in a random string'
,'IPv6 test 198.192.0.127 ::ffff:198.192.0.127 in a random string'
,'IPv6 test 198.192.0.127 ::1 in a random string'
,'IPv6 test 198.192.0.127 :: in a random string'
]
| project pv6addr = extract(@"([[:xdigit:]]{1,4}:){7}[[:xdigit:]]{1,4}|[[:xdigit:]:.]*::[[:xdigit:]:.]*", 0, ipv6)
pv6addr |
---|
2345:5:2CA1:0000:0000:567:5673:256 |
2345:5:2CA1::567:5673:256 |
::ffff:198.192.0.127 |
::1 |
:: |
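The same kind of protection layer can be sketched for IPv6, assuming the built-in `parse_ipv6()` function returns an empty result for input it cannot parse. The second input row and the `candidate` column name are made up for illustration.

```kusto
datatable (ipv6:string)
[
'IPv6 test 198.192.0.127 2345:5:2CA1::567:5673:256/127 in a random string'
,'Hypothetical row: 1:2:::3 has adjacent colons but is not an address'
]
// Step 1: extract a candidate with the permissive pattern from above
| extend candidate = extract(@"([[:xdigit:]]{1,4}:){7}[[:xdigit:]]{1,4}|[[:xdigit:]:.]*::[[:xdigit:]:.]*", 0, ipv6)
// Step 2: keep the candidate only if parse_ipv6() can parse it
| project pv6addr = iff(isnotempty(parse_ipv6(candidate)), candidate, "")
```

This filters out malformed candidates, but it cannot tell a real address from text that merely happens to be a syntactically valid one (for example, a short token such as `d::c`), so some false positives remain possible.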