
Get last modified folder using Azure Data Factory


I am trying to get the directory (e.g. /DATE=2023-01-03/ID=333) that was modified within the last 5 minutes, and then pass that folder path as the source path of a Copy data activity. I've tried using the Get Metadata and ForEach activities, but they didn't work the way I expected.

Here is what the folder structure looks like:

silver/

|---------DATE=20230101/
|---------|---------ID=111
|---------|---------|---------part-1.c000.snappy.parquet
|---------|---------|---------part-2.c000.snappy.parquet
|---------|---------|---------committed_123    
|---------DATE=20230103/
|---------|---------ID=222
|---------|---------|---------part-1.c000.snappy.parquet
|---------|---------|---------part-2.c000.snappy.parquet
|---------|---------|---------committed_123
|---------|---------ID=333
|---------|---------|---------part-1.c000.snappy.parquet
|---------|---------|---------part-2.c000.snappy.parquet
|---------|---------|---------committed_123
|---------_SUCCESS

In this example, I'd like to get the folder path (/DATE=2023-01-03/ID=333) that was modified within the last 5 minutes, and then pass that path as the source path of a Copy activity. However, the _SUCCESS file also gets updated and is likely to appear at the top of the list when sorting by last modified datetime, so I'd like to exclude _SUCCESS and pass only the last modified folder path to the Copy activity.


Solution

    • You can use an If Condition activity to check whether a particular item is a file or a folder. The following is my folder structure for demonstration:

    [Image: folder structure used for the demonstration]

    • I used a Get Metadata activity to get the child items of this container, and the output is shown in the image below:

    [Image: Get Metadata output listing the child items]

    • Now, instead of trying to get the most recently modified folder directly from these child items, use an If Condition activity to check whether each child's type is File or Folder. I used the following condition:
    @equals(item().type,'Folder')

    [Image: If Condition activity settings]

    • Alternatively, you can use a Filter activity to filter out the File-type child items (after the first Get Metadata activity and before the ForEach loop). With this condition, only folders are considered when you look for the latest last modified date:

    [Image: Filter activity settings]
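
For reference, here is a rough sketch of how the Get Metadata and Filter steps above might look in the pipeline JSON. This is only an illustration under some assumptions: the dataset name SilverContainer and the activity name FilterFolders are made up, and your dataset, linked service, and any store settings will differ. The condition is the same @equals(item().type,'Folder') expression used above.

```json
[
    {
        "name": "Get Metadata1",
        "description": "Lists the immediate child items of the folder the dataset points at. 'SilverContainer' is a hypothetical dataset name.",
        "type": "GetMetadata",
        "typeProperties": {
            "dataset": {
                "referenceName": "SilverContainer",
                "type": "DatasetReference"
            },
            "fieldList": ["childItems"]
        }
    },
    {
        "name": "FilterFolders",
        "description": "Keeps only the child items whose type is Folder, so files such as _SUCCESS are dropped. 'FilterFolders' is a hypothetical activity name.",
        "type": "Filter",
        "dependsOn": [
            {
                "activity": "Get Metadata1",
                "dependencyConditions": ["Succeeded"]
            }
        ],
        "typeProperties": {
            "items": {
                "value": "@activity('Get Metadata1').output.childItems",
                "type": "Expression"
            },
            "condition": {
                "value": "@equals(item().type,'Folder')",
                "type": "Expression"
            }
        }
    }
]
```

With something like this in place, a following ForEach activity could iterate over @activity('FilterFolders').output.Value instead of the raw childItems, so that only folders are checked for their last modified date.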