azure-storage, azure-log-analytics, kql

Strategies for working with more than 30k records in Azure Log Analytics results?


I have a Log Analytics Diagnostic Setting on a very active ADLS Gen2 Storage Account. The goal is to reconcile blobs uploaded to Containers within the Storage Account with blobs processed by an Azure Function.

Problem: Azure Log Analytics does not return result sets > 30k records


Ideally, the reconciliation happens all at once; compare incoming blobs with processed blobs at the end of the day.

But this doesn't seem possible if there are > 30k records. Seems like I'll have to schedule some kind of hourly reconcile (not ideal).
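The end-of-day check itself is essentially a set comparison. A minimal sketch, with hypothetical blob names standing in for the two query results:

```python
# Hypothetical end-of-day reconciliation: compare blob names seen in the
# storage diagnostic logs against blob names the Function reported processing.
uploaded = {"a.csv", "b.csv", "c.csv"}   # e.g. from Log Analytics storage logs
processed = {"a.csv", "c.csv"}           # e.g. from the Function's own logs

missed = uploaded - processed      # uploaded but never processed
unexpected = processed - uploaded  # processed with no matching upload record
```

The hard part is not the comparison but getting the full `uploaded` set out of Log Analytics when it exceeds 30k rows.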

What are some strategies for handling this in a simple way?


Solution

  • You could use a cursor in the filter. For example, run a query that sorts the results on a specific, preferably unique, column. Then take the last value returned and use it as a filter in the subsequent request to fetch the next set of results. I'm not sure whether monitoring guarantees that 100% of the incoming requests are available.

    You could also look at using proper event-driven resources. For example, you could set up an Event Grid subscription on the storage account to push blob-creation events to either a Queue or a Service Bus, then process your blobs using that information.
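The cursor approach can be sketched in Python. This simulates paged queries against records sorted on a unique, monotonically increasing column (in a real setup the query would be issued via the Azure Monitor Query SDK with a KQL filter like `where TimeGenerated > datetime(<cursor>)`; the record layout here is illustrative):

```python
from datetime import datetime, timedelta

# Simulated log records, ordered by a unique, monotonically increasing key.
RECORDS = [
    {"TimeGenerated": datetime(2023, 1, 1) + timedelta(seconds=i), "blob": f"blob-{i}"}
    for i in range(75_000)
]

PAGE_SIZE = 30_000  # Log Analytics caps a single result set at 30k rows.


def query_page(cursor):
    """Stand-in for a KQL query roughly like:
    ... | where TimeGenerated > datetime(<cursor>)
        | sort by TimeGenerated asc
        | take 30000
    """
    matching = [r for r in RECORDS if cursor is None or r["TimeGenerated"] > cursor]
    return matching[:PAGE_SIZE]


def fetch_all():
    results, cursor = [], None
    while True:
        page = query_page(cursor)
        if not page:
            break
        results.extend(page)
        # The last value of the sort column becomes the next request's filter.
        cursor = page[-1]["TimeGenerated"]
    return results


all_rows = fetch_all()
```

Because the cursor column is unique and the pages are sorted, each row is fetched exactly once even though no single query can return more than 30k rows.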
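With the event-driven route, the consumer dequeues Event Grid messages and extracts the blob URL from the documented `Microsoft.Storage.BlobCreated` payload. A minimal parsing sketch (the payload values are made up; only the field names follow the published schema):

```python
import json

# Example Event Grid payload for a BlobCreated event. Field names follow the
# documented Microsoft.Storage.BlobCreated schema; the values are illustrative.
raw_message = json.dumps({
    "eventType": "Microsoft.Storage.BlobCreated",
    "subject": "/blobServices/default/containers/incoming/blobs/file1.csv",
    "eventTime": "2023-01-01T00:00:00Z",
    "data": {"url": "https://myaccount.blob.core.windows.net/incoming/file1.csv"},
})


def blob_from_event(message: str):
    """Extract the blob URL from a queued Event Grid message."""
    event = json.loads(message)
    if event.get("eventType") != "Microsoft.Storage.BlobCreated":
        return None  # ignore unrelated event types
    return event["data"]["url"]


url = blob_from_event(raw_message)
```

This sidesteps the 30k limit entirely: every created blob produces one message, so reconciliation becomes a matter of draining the queue rather than paging a log query.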