I am new to Synapse and I need to build a pipeline that deletes files from folders in a hierarchy like the one in the attached image. The red half circles mark the files I would like to delete, for example files older than 2 months.
So far I have made a pipeline for a single folder: using a for each loop I can get to the files and delete the corresponding ones, and it works. Since I have about 60-70 folders and even more files, I wanted to go one level higher and have the pipeline run for each folder. This is where the problem is. When I use a Get Metadata activity on the top folder and a for each loop over the folder names, I can only access the folders, not the files inside them. Could someone help me solve this?
We can achieve this using nested for each activities with the help of the execute pipeline activity. As mentioned, get metadata with wildcards returns all files without folders, and the delete activity is unable to recognize wildcard folder paths (Folder/*).
I created an array parameter req_files in the parent pipeline (sample1.csv and sample2.csv) with the names of the files required. Note: If you want to do this dynamically, you can use an append variable activity to build the required file names (file09/22 and file08/22).
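If the list of files to keep depends on the date, the names can be generated rather than typed in. The following is a minimal Python sketch of that logic only, not Synapse expression code. The function name recent_file_names and the file<MM>/<YY> pattern are assumptions inferred from the example names above (file09/22 and file08/22); in the pipeline the same list would be built with an append variable activity.

```python
from datetime import date


def recent_file_names(today: date, months: int = 2, prefix: str = "file") -> list:
    """Return names such as 'file09/22' for the current month and the months before it.

    Hypothetical helper: the naming pattern is an assumption, adjust it to your files.
    """
    names = []
    year, month = today.year, today.month
    for _ in range(months):
        # Two-digit month and two-digit year, e.g. 09/22
        names.append(f"{prefix}{month:02d}/{year % 100:02d}")
        month -= 1
        if month == 0:
            # Step back across a year boundary
            month, year = 12, year - 1
    return names
```

Called with a date in September 2022 and the default of 2 months, this yields the two names used in the note above.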
I used a get metadata activity (root folder contents) to get the folder names inside the root folder. I iterate through its output in my for each activity (items value is @activity('root folder contents').output.childItems).
Inside the for each, I used another get metadata activity (sub folder contents) to loop through each of the sub folders and get their file contents.
I then used execute pipeline to implement the nested for each. Create 3 parameters in a new pipeline called delete_pipeline (where I perform the delete): current_folder, folder_files and files_needed. Pass the following values to them from the execute pipeline activity:
current_folder: @item().name
folder_files: @activity('sub folder contents').output.childItems
files_needed: @pipeline().parameters.req_files
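To make clear what each parameter receives, here is a small Python sketch of the values handed to delete_pipeline on one iteration of the outer for each. It only models the data; the folder and file names are made up for illustration, and build_delete_pipeline_parameters is a hypothetical helper. The name/type shape of each entry mirrors what get metadata returns in childItems.

```python
# Hypothetical output of the inner get metadata ('sub folder contents') for one sub folder.
sub_folder_contents = {
    "childItems": [
        {"name": "sample1.csv", "type": "File"},
        {"name": "old_data.csv", "type": "File"},
    ]
}

# Current item of the outer for each: one entry of the root folder's childItems.
item = {"name": "folder1", "type": "Folder"}

# Array parameter of the parent pipeline.
req_files = ["sample1.csv", "sample2.csv"]


def build_delete_pipeline_parameters(item, sub_folder_contents, req_files):
    """Model the three values the execute pipeline activity passes on."""
    return {
        "current_folder": item["name"],                     # @item().name
        "folder_files": sub_folder_contents["childItems"],  # inner get metadata output
        "files_needed": req_files,                          # @pipeline().parameters.req_files
    }


params = build_delete_pipeline_parameters(item, sub_folder_contents, req_files)
```

Each run of delete_pipeline therefore sees exactly one folder name, that folder's file list, and the shared list of files to keep.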
In delete_pipeline, I have a for each loop over the list of files we are passing (items value is @pipeline().parameters.folder_files).
Inside this for each, I have an If condition activity. This is because I want to delete the files which are not in my req_files parameter (the array from the parent pipeline, which we passed to the files_needed parameter of delete_pipeline). The condition for the If condition activity is as follows:
@contains(pipeline().parameters.files_needed,item().name)
We need to delete a file only when it is not present in req_files (files_needed). So the delete is performed when the condition is false.
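The combined effect of the nested loops and the If condition can be sketched in plain Python. This is an illustration of the decision logic under assumed inputs, not code that runs in Synapse; files_to_delete and plan_deletes are hypothetical names, and the sample folders are invented.

```python
def files_to_delete(folder_files, files_needed):
    """Mirror the If condition: a file is deleted when
    contains(files_needed, name) evaluates to false."""
    return [f["name"] for f in folder_files if f["name"] not in files_needed]


def plan_deletes(root, req_files):
    """Outer loop over sub folders, inner check per file.

    root maps each sub folder name to its childItems list.
    """
    plan = {}
    for folder, child_items in root.items():
        plan[folder] = files_to_delete(child_items, req_files)
    return plan


# Invented example data
root = {
    "folder1": [{"name": "sample1.csv", "type": "File"},
                {"name": "old1.csv", "type": "File"}],
    "folder2": [{"name": "sample2.csv", "type": "File"},
                {"name": "old2.csv", "type": "File"},
                {"name": "old3.csv", "type": "File"}],
}
plan = plan_deletes(root, ["sample1.csv", "sample2.csv"])
```

Note that childItems can also contain entries of type Folder; the sketch, like the pipeline described here, does not treat them separately.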
I have created 2 parameters, file_namepath_of_file_to_delete and file_name_to_delete, in the dataset I am using for the delete activity, with the following dynamic content:
file_namepath_of_file_to_delete: Folder/@{pipeline().parameters.current_folder}
file_name_to_delete: @item().name
When I run the pipeline, it keeps the required files and deletes the rest. The following are output images for reference.