azure-data-factory, azure-blob-storage

Azure Blob Storage - is there no undelete event available?


I'm working with Azure Data Factory on some backup/restore use cases, and I'm looking at the Azure Blob Storage event types for Gen2 storage accounts:

https://learn.microsoft.com/en-us/azure/event-grid/event-schema-blob-storage?toc=%2Fazure%2Fstorage%2Fblobs%2Ftoc.json&tabs=cloud-event-schema#data-lake-storage-gen-2-events

...it seems that no event is raised when a blob is undeleted :o/

It surprises me a bit if this is the case... so dear community, is it really so?


Solution

  • As per the documentation, there is no undelete event for storage accounts.

    You can raise it as a feature request here.

    You can try the workaround below, which uses ADF to achieve your requirement.

    This approach requires a parent pipeline in addition to your original pipeline.

    • First, add a Get Metadata activity at the start of your original pipeline and use it to get the list of child items (file names) from your storage location.
    • At the end of the pipeline, add a Copy activity that stores this child items JSON array in a JSON file in a temporary blob storage location.
    • Use the Copy activity's additional column feature for that. Check this SO answer to copy a JSON string into a JSON file.
    • Whenever the pipeline is triggered, whether by an event or by a parent pipeline, it processes the triggering file and stores the updated child items in the temporary JSON file.
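
The snapshot step above can be illustrated outside ADF with a minimal Python sketch. It only models the logic and does not call any ADF or Azure API. The function names `save_snapshot` and `load_snapshot` and the file name are hypothetical, and the items are assumed to have the `name`/`type` shape that the Get Metadata activity returns for child items.

```python
import json
from pathlib import Path


def save_snapshot(child_items, snapshot_path):
    """Write the child items array to a JSON file, replacing any previous snapshot."""
    text = json.dumps(child_items, indent=2)
    Path(snapshot_path).write_text(text, encoding="utf-8")


def load_snapshot(snapshot_path):
    """Return the stored child items, or an empty list if no snapshot exists yet."""
    path = Path(snapshot_path)
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))


if __name__ == "__main__":
    # Hypothetical listing, shaped like the child items of a Get Metadata activity.
    items = [
        {"name": "backup-01.csv", "type": "File"},
        {"name": "backup-02.csv", "type": "File"},
    ]
    save_snapshot(items, "processed_items.json")
    print(load_snapshot("processed_items.json"))
```

In the actual pipeline, the Copy activity plays the role of `save_snapshot`, and the temporary JSON file in blob storage plays the role of `processed_items.json`.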

    Now, create the parent pipeline as follows.

    • Create a Lookup activity on the above JSON file. This returns the list of child items that have already been processed.
    • Create a Get Metadata activity for your storage location and get all child items. This returns all child items, including any new files. Newly created or deleted files trigger the original pipeline and get recorded in the JSON file, so the only extra files are the undeleted ones.
    • Compare these two arrays using a Filter activity to get the list of file names that are not in the Lookup output array. Use a ForEach activity on the filtered output array, call the original pipeline inside it, and pass the file name to it as a parameter. This file then also gets recorded in the JSON file.
    • Schedule this parent pipeline to run regularly, for example daily or every 12 hours. It will not fire at the moment a file is undeleted, but it ensures that undeleted files get processed on a regular basis.
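
The comparison performed by the Filter activity amounts to a set difference on file names. Here is a minimal Python sketch of that logic only, not of ADF itself. The function name `find_unprocessed` and the sample file names are hypothetical, and both lists are assumed to hold items with a `name` field.

```python
def find_unprocessed(current_items, processed_items):
    """Return names present in the current listing but missing from the snapshot."""
    processed_names = {item["name"] for item in processed_items}
    return [
        item["name"]
        for item in current_items
        if item["name"] not in processed_names
    ]


if __name__ == "__main__":
    # Snapshot written by the original pipeline (the Lookup activity output).
    processed = [
        {"name": "a.csv", "type": "File"},
        {"name": "b.csv", "type": "File"},
    ]
    # Fresh listing of the storage location (the Get Metadata output).
    current = [
        {"name": "a.csv", "type": "File"},
        {"name": "b.csv", "type": "File"},
        {"name": "c.csv", "type": "File"},  # assumed to be an undeleted blob
    ]
    for file_name in find_unprocessed(current, processed):
        # In ADF, this is where ForEach would call the original pipeline.
        print(file_name)  # prints "c.csv"
```

Building a set of the processed names keeps the comparison linear in the number of files, which matters little at small scale but avoids a nested scan for large containers.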