Search code examples
azurekqlazure-resource-graph

KQL how able to slice the data in smaller chunks


I have this query but the output returns 106K records, what I need to add to the query do smaller chunks for data extract and export on CSV.

Resources
| where type == 'microsoft.compute/virtualmachines'
| extend
            JoinID = toupper(id),
            VMName = tostring(properties.osProfile.computerName),
            OSType = tostring(properties.storageProfile.osDisk.osType),
            VMSize = tostring(properties.hardwareProfile.vmSize),
OSVersion = tostring(properties.extended.instanceView.osVersion),
OSName = tostring(properties.extended.instanceView.osName)
| join kind=leftouter(
            Resources
            | where type == 'microsoft.compute/virtualmachines/extensions'
            | extend 
                            VMId = toupper(substring(id, 0, indexof(id, '/extensions'))),
                            ExtensionType = properties.type,
    ExtensionPublisher = properties.publisher
    ) on $left.JoinID == $right.VMId

does azure graph explorer or kql has any data export data limitation?

I have modified the query per suggestion and works for me.

   Resources
    | where type == 'microsoft.compute/virtualmachines'
    | extend
            JoinID = toupper(id),
            VMName = tostring(properties.osProfile.computerName),
            OSType = tostring(properties.storageProfile.osDisk.osType),
            VMSize = tostring(properties.hardwareProfile.vmSize),
    OSVersion = tostring(properties.extended.instanceView.osVersion),
    OSName = tostring(properties.extended.instanceView.osName)
    | join kind=leftouter(
           Resources
            | where type == 
    'microsoft.compute/virtualmachines/extensions'
            | extend 
                            VMId = toupper(substring(id, 0, indexof(id, 
     '/extensions'))),
                            ExtensionPublisher = properties.publisher,
                            Extension_Publisher  = 
      tostring(properties.publisher),
                            ExtensionType =  properties.type
      ) on $left.JoinID == $right.VMId
      | project-away  identity1, kind, managedBy, sku, identity, 
       zones,extendedLocation,extendedLocation1, kind1, managedBy1, 
      sku1, tags1, zones1, systemData1,tags,
      systemData, tenantId, JoinID, id1, id, VMName, location1, 
      tenantId1, OSType, type1, apiVersion1, VMId,  apiVersion, plan1,

ExtensionPublisher | sort by Extension_Publisher asc | extend row_number = row_number() | where row_number > 0 | take 200


Solution

  • You can extend a row number and filter by that field to paginate (offset), then you take (or limit) the amount of records you want.

    Resources
      ...
      | serialize 
      | extend row_number = row_number()
      | where row_number > 10000
      | take 5000
    

    This query returns 5000 records from 10001 to 15000.

    Edit:

    As @david-דודו-markovitz remarked in comments, queries should be consistent between calls. So you should prepare the main query (the elipses part) with an order by clausule and ensure you retrieve the same records even if the data source changes over time, for example, with a where between times.