Once a week, a file is generated in an on-prem folder. My pipeline copies that on-prem file to blob storage, then from blob to blob; during this second step the pipeline filters my data before it goes to SQL. The problem is that the filtering changes the modified date, so all the files in blob storage get pulled rather than only the one that was originally pulled for that week. I have attached images of my pipeline, the on-prem files, and what I filter for.
Instead of relying on the last modified date of the file, you can use the file name instead.
Since the date (in yyyyddMM format) is part of the filename itself, you can build the expected filename dynamically and check whether it is present in the list of filtered files.
Look at the following demonstration. Let's say I have the following 2 files as my filtered files. I used a Get Metadata activity (with the Child items field) on the blob storage. Then I used a Set Variable activity (variable name: file_name_required) with the following dynamic content to build the current week's filename:

@concat('SalesWeekly_',formatDateTime(utcnow(),'yyyyddMM'),'.csv')
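For reference, here is a minimal Python sketch of what that expression produces (the SalesWeekly_ prefix and .csv extension come from the filenames above; note the unusual yyyyddMM order, which mirrors the ADF expression exactly):

```python
from datetime import datetime, timezone

def expected_filename(now=None):
    """Mirror the ADF expression
    @concat('SalesWeekly_', formatDateTime(utcnow(), 'yyyyddMM'), '.csv').
    yyyyddMM = year, then day, then month, matching the source files.
    """
    now = now or datetime.now(timezone.utc)
    return "SalesWeekly_" + now.strftime("%Y%d%m") + ".csv"

# e.g. for 15 Jan 2023 this yields SalesWeekly_20231501.csv
```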
Now, create an array containing all the filenames returned by the Get Metadata activity. The ForEach activity's items value is given as @activity('Get Metadata1').output.childItems. Inside it, use an Append Variable activity with the value @item().name.
Then use an If Condition activity with the following dynamic content:

@contains(variables('files_names'),variables('file_name_required'))
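Putting the pieces together, the ForEach + Append Variable + If Condition pattern reduces to a simple membership check. A hedged Python sketch (child_items stands in for the Get Metadata childItems output; the dict shape with a "name" key is an assumption for illustration):

```python
from datetime import datetime, timezone

def current_week_file(child_items, now=None):
    """Return the current week's filename if it appears among the
    Get Metadata child items, else None.

    Mirrors the pipeline: build file_name_required, collect names
    (the ForEach + Append Variable loop), then check
    @contains(variables('files_names'), variables('file_name_required')).
    """
    now = now or datetime.now(timezone.utc)
    required = "SalesWeekly_" + now.strftime("%Y%d%m") + ".csv"  # yyyyddMM
    file_names = [item["name"] for item in child_items]  # append-variable loop
    return required if required in file_names else None  # if-condition check

# A non-None result corresponds to the True case (run the copy activity);
# None corresponds to the False case (skip the insert).
```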
The following are reference images of the output flow.
I have used a Wait activity here for the demo. You can replace it with your Copy activity (from blob to SQL) in the True case. If you don't want to insert anything when the current week's file is missing, leave the False case empty.