azure-synapse azure-data-factory

How to copy many CSV files with the date in the file name from an online source using Synapse Pipelines?


There is a publicly available Git repository that is refreshed daily. It contains several CSV files named like "DA-01-12-2022", "DA-02-12-2022", "DA-03-12-2022" and so on, with the date in the file name. The date also appears in the GitHub link, so I can copy a single file without a problem. But since there are many CSV files in the Git folder, how can I use Synapse Pipelines to copy all the files in the repository to a storage account in Azure? I feel like I have to use a loop, but how can I tell it to use the date?

Thanks and best regards!


Solution

  • You can use a Copy activity to load all the CSV files and store them in a single Parquet file.

    You can also use a ForEach activity to loop over the dates.
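To illustrate the date loop, here is a minimal Python sketch of the list of file names such a loop would iterate over. It assumes the names follow the day-month-year pattern from the question ("DA-01-12-2022"); the `.csv` suffix and the function name `dated_file_names` are assumptions made for illustration, not something confirmed by the repository.

```python
from datetime import date, timedelta


def dated_file_names(start, end, prefix="DA-", suffix=".csv"):
    """Return one file name per day from start to end, inclusive.

    The prefix and the DD-MM-YYYY layout come from the question;
    the ".csv" suffix is an assumption.
    """
    day_count = (end - start).days + 1
    return [
        f"{prefix}{start + timedelta(days=offset):%d-%m-%Y}{suffix}"
        for offset in range(day_count)
    ]


# Example: the three files mentioned in the question.
names = dated_file_names(date(2022, 12, 1), date(2022, 12, 3))
for name in names:
    print(name)
```

In a pipeline, the same idea would likely map to a ForEach activity iterating over a range of day offsets, with the file name built inside the loop and passed to a parameterized source dataset of the Copy activity.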