I currently have this regex:
?P<key>\w+)=(?P<value>[a-zA-Z0-9-_:/@. ]+
Input row 1: event=1921;json={"source":"A","location":B":"folder":"c:\\windows\\system32"},"id":2,"address":null,"name":"gone";
Input row 2: dev=b;json={"dest":"123","home":AZ":"loc":"sys"},"ab":9,"home":null,"someKey":"someValue";
It correctly extracts the "event=1921;" but does extract the two other types.
"json={...}"
using Key (JSON) and Value?"name":"gone"
using Key (Name) and Value (gone)?
The solution needs to be dynamic since key fields will be named differently in other rows.you should be able to use the parse
operator: https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/parseoperator
for example:
print input = 'event=1921;json={"source":"A","location":B":"folder":"c:\\windows\\system32"},"id":2,"address":null,"name":"gone";'
| parse input with * "json=" json:dynamic ',"id"' * '"name":"' name '"' *
if your payload / property names are entirely dynamic, then:
a. I would recommend you evaluate your options to structure the source data in standard format (currently, even the "json
" part of it isn't valid JSON)
b. you could try the following - functional, but very inefficient (not recommended for large scale data procesing)
datatable(input:string)
[
'event=1921;json={"source":"A","location":B":"folder":"c:\\windows\\system32"},"id":2,"address":null,"name":"gone";',
'dev=b;json={"dest":"123","home":AZ":"loc":"sys"},"ab":9,"home":null,"someKey":"someValue";'
]
| parse input with prefix ";json={" json:dynamic '},' suffix
| mv-apply x = extract_all(@'(\w+)=(\w+)', prefix) on (
project p = pack(tostring(x[0]), x[1])
| summarize b1 = make_bag(p)
)
| mv-apply y = extract_all(@'"(\w+)":"?(\w+)"?', suffix) on (
project p = pack(tostring(y[0]), y[1])
| summarize b2 = make_bag(p)
)
| project json = strcat("{", json, "}"), b = bag_merge(b1, b2)
| evaluate bag_unpack(b)