Tags: python, azure-pipeline, azure-data-factory, azure-data-lake

File checks before copying data to Azure Data Lake through Azure Data Factory


Currently I am building a data pipeline where I want to copy data from one blob storage to Azure Data Lake through Azure Data Factory, but before the copy runs I want a file check step. The pipeline should check the directory for the file (for example, a CSV file): if the file is present, start copying it to ADLS; otherwise, throw a "file not found" error. I know we can do this in Python, but how do I add that check to an ADF pipeline? Any help will be appreciated.


Solution

  • I would use the Get Metadata activity to get a list of all items in your blob storage (select your blob container as the dataset): https://learn.microsoft.com/en-us/azure/data-factory/control-flow-get-metadata-activity

    Then you may need to check whether each item is a file rather than a folder. For that you can add a combination of ForEach and If Condition activities: iterate over the items returned by the Get Metadata step using the @activity('GetMetadata').output.childItems expression, and inside the loop use @equals(item().type, 'File') to check whether the current item is a file, as in the sketch below.
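    For reference, here is a minimal sketch of how those activities and expressions might be wired together in a pipeline JSON definition. The pipeline name, activity names, the BlobSourceDataset dataset reference, and the stub Copy activity (whose source and sink settings are omitted) are assumptions for illustration only, not part of the original answer:

```json
{
  "name": "CopyCsvIfPresent",
  "properties": {
    "activities": [
      {
        "name": "GetMetadata",
        "type": "GetMetadata",
        "typeProperties": {
          "dataset": { "referenceName": "BlobSourceDataset", "type": "DatasetReference" },
          "fieldList": [ "childItems" ]
        }
      },
      {
        "name": "ForEachChildItem",
        "type": "ForEach",
        "dependsOn": [ { "activity": "GetMetadata", "dependencyConditions": [ "Succeeded" ] } ],
        "typeProperties": {
          "items": { "value": "@activity('GetMetadata').output.childItems", "type": "Expression" },
          "activities": [
            {
              "name": "IfItemIsFile",
              "type": "IfCondition",
              "typeProperties": {
                "expression": { "value": "@equals(item().type, 'File')", "type": "Expression" },
                "ifTrueActivities": [
                  { "name": "CopyFileToAdls", "type": "Copy" }
                ]
              }
            }
          ]
        }
      }
    ]
  }
}
```

    The exact shape depends on your datasets; the If Condition also supports an ifFalseActivities branch if you need to handle items that are not files, for example to surface an error when the expected file is missing.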