Search code examples
kqlazure-log-analyticsazure-log-analytics-workspace

What is a straightforward way to parse hierarchical namespaces for filenames in Azure Storage Uri's using KQL?


I'm trying to create a Log Analytics Workbook for some Azure Blob Storage accounts. The accounts are ADLS Gen2 w/ Hierarchical Namespaces enabled. The issue is that some files are uploaded directly under the Container-level, some are uploaded inside subdirectories or subsubdirectories (Hierarchical Namespaces).

This makes it rather challenging to surface filenames using KQL.

Is there a simple way to grab each subdirectory if they exist and each filename or do I have to come up with a custom method for each blob container?

Example:

// Define variables
let varStartTime            = todatetime('2022-10-02T18:00:00Z');
let varEndTime              = todatetime('2022-10-02T20:00:00Z');
let varAccountName          = 'stgtest';
//
StorageBlobLogs
| where
    AccountName == varAccountName
    and TimeGenerated       between (varStartTime .. varEndTime)
| extend FileName           = todynamic(split(Uri, '/')) // herein lies the issue
| project
    TimeGenerated,
    AccountName,
    OperationName,
    FileName
| sort by TimeGenerated asc

Results:

enter image description here


Solution

  • parse_url
    parse_path

    StorageBlobLogs
    | sample 15
    | project Uri
    | extend Path = parse_path(tostring(parse_url(Uri).Path))
    | evaluate bag_unpack(Path, "Path_") 
    

    Azure Log Analytics | Demo Environment

    Results