
Ingest Data From On-Premise SFTP Folder To Azure SQL Database (Azure Data Factory)


Use case: I have data files of varying sizes copied to a specific SFTP folder periodically (daily/weekly). All these files need to be validated, processed, and then written to the related tables in Azure SQL. The files are in CSV format; each is a flat text file that corresponds directly to a specific table in Azure SQL.

Implementation:

I am planning to use Azure Data Factory. So far, from my reading, I can see that I can have a Copy pipeline to copy the data from the on-prem SFTP folder to Azure Blob Storage, and an SSIS pipeline to copy data from an on-premises SQL Server to Azure SQL.

But I don't see an existing solution that achieves what I am looking for. Can someone provide some insight on how I can achieve this?


Solution

  • I would try to use Data Factory with a Data Flow to validate/process the files (if that is feasible for your case). If the validation is too complex or depends on other components, I would use Azure Functions instead and write the resulting files to Blob Storage; a Copy activity can then import the resulting CSV files into Azure SQL.

    You can create a pipeline that does the following:

    1. Copy data - copy the files from SFTP to Blob Storage (see the JSON sketch after this list)
    2. Process/validate the data via a Data Flow
    3. Sink the results directly into the SQL table (via the Data Flow sink)
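    As a rough illustration, step 1 could be expressed as a Copy activity like the one below. This is only a sketch: the dataset names (`SftpBinaryDataset`, `BlobBinaryDataset`) are hypothetical placeholders for datasets you would define yourself.

    ```json
    {
        "name": "CopySftpToBlob",
        "type": "Copy",
        "description": "Step 1: stage raw CSV files from the on-prem SFTP folder into Blob Storage",
        "typeProperties": {
            "source": {
                "type": "BinarySource",
                "storeSettings": {
                    "type": "SftpReadSettings",
                    "recursive": true,
                    "wildcardFileName": "*.csv"
                }
            },
            "sink": {
                "type": "BinarySink",
                "storeSettings": {
                    "type": "AzureBlobStorageWriteSettings"
                }
            }
        },
        "inputs": [
            { "referenceName": "SftpBinaryDataset", "type": "DatasetReference" }
        ],
        "outputs": [
            { "referenceName": "BlobBinaryDataset", "type": "DatasetReference" }
        ]
    }
    ```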

    Of course, if the on-prem server is not publicly accessible, you need an integration runtime that can reach it, either via VNet integration or via a self-hosted IR.
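    For the self-hosted IR case, a minimal SFTP linked service pointing at that runtime could look like the sketch below. The host, the IR name (`SelfHostedIR`), and the Key Vault reference are assumptions, not your actual values; any authentication type the SFTP connector supports would work here.

    ```json
    {
        "name": "OnPremSftpLinkedService",
        "properties": {
            "type": "Sftp",
            "typeProperties": {
                "host": "sftp.example.local",
                "port": 22,
                "authenticationType": "Basic",
                "userName": "sftpuser",
                "password": {
                    "type": "AzureKeyVaultSecret",
                    "store": {
                        "referenceName": "KeyVaultLinkedService",
                        "type": "LinkedServiceReference"
                    },
                    "secretName": "sftp-password"
                }
            },
            "connectVia": {
                "referenceName": "SelfHostedIR",
                "type": "IntegrationRuntimeReference"
            }
        }
    }
    ```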