I have a case where I need to ingest CSV files into Cosmos DB. I have one dataset for the CSV source and another for the Cosmos DB sink.
In the pipeline, I have a Copy Data activity that maps from the CSV and writes to Cosmos DB. In the Copy Data source settings, I specify the Azure Blob Storage location where the CSV files are stored.
Until now, there was no problem. However, I now need to make sure the blobs are ingested in alphabetical order, based on file name.
Is there a way to do this?
It's hard to sort by file name in ADF.
One way to achieve this:
Save all your file names in a CSV file, then use a Sort transformation in a Data Flow and overwrite that file. Finally, use a Lookup activity and a For Each activity to copy the blobs to Cosmos DB.
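Another way:
Pass the childItems from the Get Metadata activity's output to an Azure Function and sort the file names there. Then loop over the Function's output with a For Each activity and copy each blob to Cosmos DB.
With either approach, set the For Each activity to run sequentially; by default its iterations run in parallel, so the sorted order would not be preserved.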
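As a rough sketch of the sorting step inside the Azure Function, assuming the function receives the childItems array as JSON (a list of objects with `name` and `type` fields, as Get Metadata returns them). The helper name `sort_child_items` is hypothetical, and the HTTP trigger wiring is left out so that only the sorting logic is shown:

```python
import json


def sort_child_items(child_items):
    """Return the file names from a childItems array in alphabetical order.

    Hypothetical helper: assumes each item is a dict with "name" and "type"
    keys, and skips anything that is not a file (e.g. folders).
    """
    names = [item["name"] for item in child_items if item.get("type") == "File"]
    # Case-insensitive ordering, with the raw name as a tie-breaker so the
    # result is deterministic when names differ only by case.
    return sorted(names, key=lambda name: (name.casefold(), name))


# Example payload shaped like the Get Metadata childItems output.
payload = json.loads(
    '[{"name": "b.csv", "type": "File"},'
    ' {"name": "a.csv", "type": "File"},'
    ' {"name": "archive", "type": "Folder"}]'
)
print(sort_child_items(payload))  # ['a.csv', 'b.csv']
```

Note that this is a plain lexicographic sort, so `file10.csv` comes before `file2.csv`. If your file names contain unpadded numbers, you would need a different sort key.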