azure, azure-blob-storage, azure-eventgrid

Blob Created event only after Blob committed?


We have set up an Azure Function with an eventHubTrigger to read blobs created by Apache NiFi. The events use the EventGridSchema and are filtered for Blob Created events, which largely works fine. The problem is that the Azure Function occasionally fetches the blob before it is fully committed.

The storage account is of type BlockBlobStorage with hierarchical namespace enabled. StorageBlobLogs shows the typical sequence of operations:

NiFi processor: CreatePathFile > AppendFile > FlushFile, followed by the Azure Function doing GetBlob > DeleteBlob. When the problem occurs, the GetBlob operation happens before FlushFile, or even before AppendFile.

It seems that only the CreatePathFile operation triggers Blob Created events. The docs indicate that FlushWithClose would also trigger it, and we have tried applying an event filter on data.api, but that achieved nothing other than stopping function runs altogether.

How can we set up a subscription for Blob Created events that are triggered strictly after the blob has been fully committed?


Solution

  • How can we set up a subscription for Blob Created events that are triggered strictly after the blob has been fully committed?

    There is no configuration for this; when the event fires depends on which Blob REST API operation your client uses. The documentation for Microsoft.Storage.BlobCreated describes when it is fired:

    Triggered when a blob is created or replaced. Specifically, this event is triggered when clients use the PutBlob, PutBlockList, or CopyBlob operations that are available in the Blob REST API and when the Block Blob is completely committed. If clients use the CopyBlob operation on accounts that have the hierarchical namespace feature enabled on them, the CopyBlob operation works a little differently. In that case, the Microsoft.Storage.BlobCreated event is triggered when the CopyBlob operation is initiated and not when the Block Blob is completely committed.
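If the subscription-level filter cannot be made to work, one possible workaround is to check data.api inside the function itself and skip events that do not come from a committing operation. The following is only a minimal sketch under some assumptions: the event arrives as a dict with the Event Grid fields eventType and data.api, the allow-list holds the operations named in the quoted documentation plus FlushWithClose from the question, and the helper name is_committed_blob_event and its hierarchical_namespace parameter are hypothetical. CopyBlob is excluded for hierarchical namespace accounts because, per the quote above, the event then fires when the copy is initiated.

```python
from typing import Any, Mapping

# Operations taken from the quoted documentation, plus FlushWithClose,
# which the question says the docs also list. Adjust to your own findings.
COMMITTED_APIS = frozenset({"PutBlob", "PutBlockList", "CopyBlob", "FlushWithClose"})


def is_committed_blob_event(
    event: Mapping[str, Any],
    hierarchical_namespace: bool = True,
) -> bool:
    """Return True if the event looks like a BlobCreated event for a committed blob.

    Hypothetical helper: `event` is assumed to be one parsed Event Grid event.
    """
    if event.get("eventType") != "Microsoft.Storage.BlobCreated":
        return False

    data = event.get("data") or {}
    api = data.get("api")

    # With hierarchical namespace enabled, CopyBlob fires the event when the
    # copy is initiated, so it does not indicate a committed blob.
    if hierarchical_namespace and api == "CopyBlob":
        return False

    return api in COMMITTED_APIS


if __name__ == "__main__":
    flushed = {
        "eventType": "Microsoft.Storage.BlobCreated",
        "data": {"api": "FlushWithClose"},
    }
    # "CreateFile" is an assumed value for the event raised at file creation.
    created = {
        "eventType": "Microsoft.Storage.BlobCreated",
        "data": {"api": "CreateFile"},
    }
    print(is_committed_blob_event(flushed))  # True
    print(is_committed_blob_event(created))  # False
```

In the function, events for which this check returns False would simply be ignored, so the blob is only fetched once a later event from a committing operation arrives. This only helps if the writer actually closes the file with a flush that raises such an event, which should be verified against the events received from the NiFi processor.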