azure-functions, azure-blob-storage, azure-storage, azure-blob-trigger

Can you use a BlobTrigger for a blob container with millions of blobs?


I have an existing blob container with over 3 million blobs in it. I have written an Azure Function that uses a BlobTrigger and a Blob output binding to copy each file, including its tags, to another container in another storage account.

The Azure docs seem to indicate that a BlobTrigger on a standard Blob Storage container is not recommended, or perhaps not supported, for "high-scale" containers (those with more than 100,000 blobs).

My function does work against this container, but it takes about 9 minutes from startup (when the host lock lease is acquired) until the first files start processing.

The problem is, I need to process the existing files, and none of the other options in that Azure doc seem to support processing of existing blobs.

Do I proceed with the function I have, or should I avoid it because of its long start time? Perhaps one of the event-based options is better, but then how do I "catch up" on the existing files first?


Solution

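  • I agree with @Peter Bons: for the existing files you can use the azcopy copy command, which copies the blobs from one storage account to another. AzCopy can also preserve blob index tags on such service-to-service copies if you add the --s2s-preserve-blob-tags flag. The basic commands, from the Microsoft documentation, are as follows. To copy a directory to a container in another storage account: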

    azcopy copy 'https://mysourceaccount.blob.core.windows.net/mycontainer/myBlobDirectory' 'https://mydestinationaccount.blob.core.windows.net/mycontainer' --recursive
    

    To copy all containers, directories, and blobs to another storage account:

    azcopy copy 'https://mysourceaccount.blob.core.windows.net/' 'https://mydestinationaccount.blob.core.windows.net' --recursive
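
If the one-off catch-up copy needs to be scripted, the commands above can be assembled programmatically. The following is only a minimal sketch in Python, not part of the original answer: the helper names build_azcopy_copy_args and run_copy are hypothetical, it assumes azcopy is installed and already authorized for both accounts (for example via SAS tokens appended to the URLs), and it assumes the installed AzCopy version supports --s2s-preserve-blob-tags. By default it only prints the command so it can be reviewed before anything is copied.

```python
import shlex
import subprocess


def build_azcopy_copy_args(source_url, dest_url, recursive=True, preserve_tags=True):
    """Build the argument list for an `azcopy copy` between two blob locations.

    Hypothetical helper for illustration; the URLs are passed through unchanged.
    """
    args = ["azcopy", "copy", source_url, dest_url]
    if recursive:
        # Copy everything below the source container or directory.
        args.append("--recursive")
    if preserve_tags:
        # Assumption: the installed AzCopy version supports this flag.
        args.append("--s2s-preserve-blob-tags")
    return args


def run_copy(source_url, dest_url, dry_run=True):
    """Print the command (dry run) or execute it and return azcopy's exit code."""
    args = build_azcopy_copy_args(source_url, dest_url)
    if dry_run:
        # Show a shell-quoted version of the command without running it.
        print(shlex.join(args))
        return 0
    return subprocess.run(args, check=False).returncode


if __name__ == "__main__":
    # Placeholder URLs taken from the commands above.
    run_copy(
        "https://mysourceaccount.blob.core.windows.net/mycontainer",
        "https://mydestinationaccount.blob.core.windows.net/mycontainer",
    )
```

Once the existing blobs have been copied this way, an event-based trigger can handle newly arriving blobs, so the function never has to scan the 3 million existing ones.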