Search code examples
kqlazure-data-studioazure-sentinel

Parse string into property bag and loop through its keys to display its values in an extended column


I am ingesting a log that ends up putting a lot of good data in the "AdditionalExtensions" field. This is an example of the data:

Action=Allow;[email protected];SrcIpAddr=192.168.1.146;SrcPortNumber=64694

I am trying to figure out if there is a way to, once I split(AdditionalExtensions,";") to convert the string array that the split() function produces into a property bag, and then loop through its keys, sort of what in python would be:

for k,v in mydict:
  print(f'{k}= {v}')

but of course I would have to extend its key into for example, Action, User, SrcIpAddr, SrcPortNumber so I can ultimately end up with something like:

| extend Action = loopThrough1stIteminDict[v],
         User = loopThrough2ndIteminDict[v]
| project Action, User

And the end result be:

Action   User

Allow    [email protected]

if that's not possible, what's more efficient in KQL, to use this:

| extend parser = split(AdditionalExtensions,";")
| extend Action = extract("=(.*)",1,tostring(parser[0])),

or instead of using extract(), use substring(), but using indexof() to tell the substring I want to start at the index where the equal sign is, and go to the end of the string

There is about 30-40 fields in the AdditionalExtensions field that I am looking to do this thorough for an important log that I may need to go back in time a lot, I don't want a query that goes back 2 hours to stall, let alone have to go back 7 days or more, KQL fails a lot when going back in time, nothing like Splunk of course, but I am working on this product now.


Solution

  • Ideally, you'll adjust the source to write semi-structured data in a supported and standard format - JSON - instead of semicolon/equals-sign-separated key-value pairs. This would allow you to avoid inefficient query-time parsing of the raw data whenever you query it.

    Ignoring performance and efficiency, you could still achieve that kind of parsing as follows:

    datatable (s:string)
    [
        'Action=Allow;[email protected];SrcIpAddr=192.168.1.146;SrcPortNumber=64694',
        'Action=Deny;[email protected];SrcIpAddr=192.168.1.147;SrcPortNumber=64695',
        'Action=Allow;[email protected];SrcIpAddr=192.168.1.148;SrcPortNumber=64696'
    ]
    | mv-apply s = split(s, ";") on (
        parse s with key "=" value
        | summarize b = make_bag(pack(key, value))
    )
    | evaluate bag_unpack(b)
    
    Action SrcIpAddr SrcPortNumber User
    Allow 192.168.1.146 64694 [email protected]
    Deny 192.168.1.147 64695 [email protected]
    Allow 192.168.1.148 64696 [email protected]