Search code examples
azure-application-insightskql

In KQL how can I use bag_unpack to turn a serialized dictionary object in customDimensions into columns?


I'm trying to write a KQL query that will, among other things, display the contents of a serialized dictionary called Tags which has been added to the Application Insights traces table customDimensions column by application logging.

An example of the serialized Tags dictionary is:

{
    "Source": "SAP",
    "Destination": "TC",
    "SAPDeliveryNo": "0012345678",
    "PalletID": "(00)312340123456789012(02)21234987654(05)123456(06)1234567890"
}

I'd like to use evaluate bag_unpack(...) to evaluate the JSON and turn the keys into columns. We're likely to add more keys to the dictionary as the project develops and it would be handy not to have to explicitly list every column name in the query.

However, I'm already using project to reduce the number of other columns I display. How can I use both a project statement, to only display some of the other columns, and evaluate bag_unpack(...) to automatically unpack the Tags dictionary into columns?

Or is that not possible?

This is what I have so far, which doesn't work:

traces 
| where datetime_part("dayOfYear", timestamp) == datetime_part("dayOfYear", now()) 
    and message has "SendPalletData"
| extend TagsRaw = parse_json(customDimensions.["Tags"])
| evaluate bag_unpack(TagsRaw)
| project timestamp, message, ActionName = customDimensions.["ActionName"], TagsRaw
| order by timestamp desc

When it runs it displays only the columns listed in the project statement (including TagsRaw, so I know the Tags exist in customDimensions).

evaluate bag_unpack(TagsRaw) doesn't automatically add extra columns to the result set unpacked from the Tags in customDimensions.

EDIT: To clarify what I want to achieve, these are the columns I want to output:

timestamp message ActionName TagsRaw Source Destination SAPDeliveryNo PalletID

EDIT 2: It turned out a major part of my problem was that double quotes within the Tags data are being escaped. While the Tags as viewed in the Azure portal looked like normal JSON, and copied out as normal JSON, when I copied out the whole of a customDimensions record the Tags looked like "Tags": "{\"Source\":\"SAP\",\"Destination\":\"TC\", ... with the double quotes escaped with backslashes.

The accepted answer from David Markovitz handles this situation in the line:

TagsRaw     = todynamic(tostring(customDimensions["Tags"]))

Solution

  • A few comments:

    • When filtering on timestamp, better use the timestamp column As Is, and do the manipulations on the other side of the equation.

    • When using the has[...] operators, prefer the case-sensitive one (if feasable)

    • Everything extracted from dynamic value is also dynamic, and when given a dynamic value parse_json() (or its equivalent, todynamic()), simply returns it, As Is.
      Therefore, we need to treet customDimensions.["Tags"] in 2 steps:
      1st, convert it to string. 2nd, convert the result to dynamic.

    • To reference a field within a dynamic type you can use X.Y, X["Y"], or "X['Y'].
      No need to combine them as you did with customDimensions.["Tags"].

    • As the bag_unpack plugin doc states:

      "The specified input column (Column) is removed."

      In other words, TagsRaw does not exist following the bag_unpack operation.

    • Please note that you can add prefix to the columns generated by bag_unpack. Might make it easier to differentiate them from the rest of the columns.

    • While you can use project, using project-away is sometimes easier.

    // Data sample generation. Not part of the solution.
    let traces = 
    print    c1 = "some columns"
            ,c2 = "we"
            ,c3 = "don't need"
            ,timestamp          =   ago(now()%1d * rand())
            ,message            =   "abc SendPalletData xyz"
            ,customDimensions   =   dynamic
                                    (
                                        {
                                            "Tags":"{\"Source\":\"SAP\",\"Destination\":\"TC\",\"SAPDeliveryNo\":\"0012345678\",\"PalletID\":\"(00)312340123456789012(02)21234987654(05)123456(06)1234567890\"}"
                                           ,"ActionName":"Action1"
                                        }
                                    )
    ;
    // Solution starts here
    traces 
    | where         timestamp >= startofday(now())
                and message has_cs "SendPalletData"
    | extend        TagsRaw     = todynamic(tostring(customDimensions["Tags"]))
                   ,ActionName  = customDimensions.["ActionName"]
    | project-away  c*               
    | evaluate      bag_unpack(TagsRaw, "TR_")
    | order by      timestamp desc
    
    timestamp message ActionName TR_Destination TR_PalletID TR_SAPDeliveryNo TR_Source
    2022-08-27T04:15:07.9337681Z abc SendPalletData xyz Action1 TC (00)312340123456789012(02)21234987654(05)123456(06)1234567890 0012345678 SAP

    Fiddle