Search code examples
azureazure-sql-serverazure-data-factory

Azure Datafactory: Dataflow can't access database


I've been building a nice ETL solution with Data Factory. We're bound to go to the production stage and now security becomes a real concern. Somehow I can't seem to get this right....

I've set up a CosmosDB and SQL Server/DB in Azure. I've added those to a virtual network and disallowed any connections outside of that network. In DataFactory I've set up an Integration Runtime with Virtual network configuration. I've added a Managed Private Endpoint in DataFactory, connected to the SQL server. I've set up a Linked Service to the SQL server using that endpoint. When I set up a Dataset using that Linked Service it works as expected. I can test the connection succesfully, select a table and retrieve it's schema.

However..... I've set up a dataflow that retrieves data from the CosmosDB, does all kinds of magic with it and writes it to a sink using the dataset defined above. When I try to test the connection on this sink, it fails stating that it can't access the SQL database.

I'm assuming this has something to do with the difference between running a pipeline (using the IR) and running a dataflow (whatever that uses?). I can't, however, find what runs that dataflow and how to make sure that "thing" can access the SQL server.

What am I doing wrong?


Solution

  • After activating, then deactivating, the "Deny public network access" option on the SQL Server "Firewalls and Virtual Networks" setting page, everything started working as expected.

    Apparently "Have you tried turning it off an on again" is still the solution to some issues....