I am trying to copy folders and their files from FTP into Azure Blob Storage by looping through the folders and, for each folder, copying its contents into a container named after that folder. For this I used Get Metadata, ForEach, and Copy Data activities. For now I am able to copy all the folders into the same container, but what I want is multiple containers in the output, each named after one of the FTP folders and containing that folder's files.
PS: I am still new to Azure Data Factory.
Any advice or help is very welcome :)
You need to add a Get Metadata activity before the ForEach. The Get Metadata activity gets the items in the current directory and passes them to the ForEach. Connect it to your Blob storage folder.
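For reference, a minimal Get Metadata activity definition might look like the sketch below; the activity and dataset names (GetFolderList, SourceRootDataset) are placeholders:

```json
{
  "name": "GetFolderList",
  "type": "GetMetadata",
  "typeProperties": {
    "dataset": {
      "referenceName": "SourceRootDataset",
      "type": "DatasetReference"
    },
    "fieldList": [ "childItems" ]
  }
}
```

The `childItems` field is what returns the list of files and folders in the dataset's path.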
Try something like this.
Set up a JSON source:
Create a pipeline and use a Get Metadata activity to list all the folders in the container/storage. Under the field list, select childItems.
Feed the Get Metadata output (the list of container contents) into a Filter activity and keep only the folders.
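As a sketch, that Filter activity could be defined like this, assuming the preceding Get Metadata activity is named GetFolderList:

```json
{
  "name": "FilterFolders",
  "type": "Filter",
  "typeProperties": {
    "items": {
      "value": "@activity('GetFolderList').output.childItems",
      "type": "Expression"
    },
    "condition": {
      "value": "@equals(item().type, 'Folder')",
      "type": "Expression"
    }
  }
}
```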
Pass the filtered list of folders to a ForEach activity.
Inside the ForEach, set the current item() to a variable and use it as the parameter for a parameterized source dataset, which is a clone of the original source.
This results in listing the files inside each folder of your container.
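For illustration, the ForEach wrapping a second Get Metadata activity against the parameterized clone might look like this; ParameterizedSourceDataset and its folderName parameter are assumptions, not required names:

```json
{
  "name": "ForEachFolder",
  "type": "ForEach",
  "typeProperties": {
    "items": {
      "value": "@activity('FilterFolders').output.Value",
      "type": "Expression"
    },
    "activities": [
      {
        "name": "GetFilesInFolder",
        "type": "GetMetadata",
        "typeProperties": {
          "dataset": {
            "referenceName": "ParameterizedSourceDataset",
            "type": "DatasetReference",
            "parameters": {
              "folderName": {
                "value": "@item().name",
                "type": "Expression"
              }
            }
          },
          "fieldList": [ "childItems" ]
        }
      }
    ]
  }
}
```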
Feed this to another Filter activity, this time keeping only files. Use @equals(item().type, 'File').
Now create another pipeline in which a Copy activity runs for each file whose name matches the name of its parent folder.
Create parameters in the new child pipeline to receive the current folder and file names in the iteration from the parent pipeline, so they can be evaluated for the copy.
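One way to wire this up is an Execute Pipeline activity inside the parent ForEach. The names below (ChildCopyPipeline, FilterFiles, folderName, filesnamesreceived) are illustrative, with FilterFiles standing for the file filter from the previous step and filesnamesreceived matching the parameter referenced next:

```json
{
  "name": "RunChildCopy",
  "type": "ExecutePipeline",
  "typeProperties": {
    "pipeline": {
      "referenceName": "ChildCopyPipeline",
      "type": "PipelineReference"
    },
    "waitOnCompletion": true,
    "parameters": {
      "folderName": {
        "value": "@item().name",
        "type": "Expression"
      },
      "filesnamesreceived": {
        "value": "@activity('FilterFiles').output.Value",
        "type": "Expression"
      }
    }
  }
}
```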
Inside the child pipeline, start with a ForEach whose input is the list of file names received into the parameter: @pipeline().parameters.filesnamesreceived.
Use a variable to hold the current item, and use an If Condition to check whether the file name and folder name match.
Note: consider dropping the file extension before comparing, as the metadata holds the complete file name including its extension.
If True, the names match: copy from source to sink.
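A sketch of that If Condition, stripping the extension with split() as suggested in the note above; the variable and dataset names (currentFile, SourceFileDataset, SinkDataset) are placeholders:

```json
{
  "name": "CheckNameMatch",
  "type": "IfCondition",
  "typeProperties": {
    "expression": {
      "value": "@equals(split(variables('currentFile'), '.')[0], pipeline().parameters.folderName)",
      "type": "Expression"
    },
    "ifTrueActivities": [
      {
        "name": "CopyMatchingFile",
        "type": "Copy",
        "inputs": [
          { "referenceName": "SourceFileDataset", "type": "DatasetReference" }
        ],
        "outputs": [
          { "referenceName": "SinkDataset", "type": "DatasetReference" }
        ],
        "typeProperties": {
          "source": { "type": "BinarySource" },
          "sink": { "type": "BinarySink" }
        }
      }
    ]
  }
}
```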
Here the hierarchy is preserved, and you can also use "Prefix" to specify the file path, since the copy preserves the hierarchy. Prefix uses the service-side filter for Blob storage, which performs better than a wildcard filter.
The sub-path after the last "/" in the prefix is preserved. For example, if you have source container/folder/subfolder/file.txt and configure the prefix as folder/sub, the preserved file path is subfolder/file.txt, which fits your scenario.
This copies files like /source/source/source.json to /sink/source/source.json.
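For reference, the prefix is set under the Copy activity source's store settings; a minimal sketch of just the source portion, using the folder/sub example above:

```json
"source": {
  "type": "BinarySource",
  "storeSettings": {
    "type": "AzureBlobStorageReadSettings",
    "recursive": true,
    "prefix": "folder/sub"
  }
}
```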