How do I extend this RegEx with CEF data with these two KeyValue pairs?

I currently have this regex:

?P<key>\w+)=(?P<value>[a-zA-Z0-9-_:/@. ]+

Input row 1: event=1921;json={"source":"A","location":B":"folder":"c:\\windows\\system32"},"id":2,"address":null,"name":"gone";

Input row 2: dev=b;json={"dest":"123","home":AZ":"loc":"sys"},"ab":9,"home":null,"someKey":"someValue";

It correctly extracts the "event=1921;" but does extract the two other types.

How do I extract the "json={...}" using Key (JSON) and Value?
How do I extract "name":"gone" using Key (Name) and Value (gone)? The solution needs to be dynamic since key fields will be named differently in other rows.

Solution

you should be able to use the parse operator: https://learn.microsoft.com/en-us/azure/data-explorer/kusto/query/parseoperator

for example:

print input = 'event=1921;json={"source":"A","location":B":"folder":"c:\\windows\\system32"},"id":2,"address":null,"name":"gone";'
| parse input with * "json=" json:dynamic ',"id"' * '"name":"' name '"' *

if your payload / property names are entirely dynamic, then:

a. I would recommend you evaluate your options to structure the source data in standard format (currently, even the "json" part of it isn't valid JSON)

b. you could try the following - functional, but very inefficient (not recommended for large scale data procesing)

datatable(input:string)
[
    'event=1921;json={"source":"A","location":B":"folder":"c:\\windows\\system32"},"id":2,"address":null,"name":"gone";',
    'dev=b;json={"dest":"123","home":AZ":"loc":"sys"},"ab":9,"home":null,"someKey":"someValue";'
]
| parse input with prefix ";json={" json:dynamic '},' suffix
| mv-apply x = extract_all(@'(\w+)=(\w+)', prefix) on (
    project p = pack(tostring(x[0]), x[1])
    | summarize b1 = make_bag(p)
)
| mv-apply y = extract_all(@'"(\w+)":"?(\w+)"?', suffix) on (
    project p = pack(tostring(y[0]), y[1])
    | summarize b2 = make_bag(p)
)
| project json = strcat("{", json, "}"), b = bag_merge(b1, b2)
| evaluate bag_unpack(b)