Search code examples
oraclessisoracle19cssis-2016

Best way to transfer data IN BATCHES/BULK from Oracle 19c to SQL Server 2016 using SSIS


We have a legacy process that runs on SSIS 2016 on Windows Server 2016, executes custom queries against databases on remote servers, pulls the results (thousands or sometimes millions of records) and stores them in a local SQL Server database. These other databases are on DB2 and Oracle 19c.

This process has always connected using an OLE DB driver and a data flow with OLE DB source and destination components. It also has always been slow.

Because of some article we read recently talking about how OLE DB transfers only 1 record at a time, but with ADO.NET this network transfer could be done in batches (is this even true?), we decided to try to use an ADO.NET driver to connect to DB2 and replace the OLE DB source and destination components by ADO.NET components.

The transfer we were using as test case, which involved 46 million records, basically flew and we could see it bring down around 10K records at a time. Something that used to run in 13 hours ran in 24 minutes with no other changes. Some small tweaks in the query allowed us to bring that time even lower to 11 minutes.

This is obviously major and we want to be able to replicate it with our Oracle data sources. Network bandwidth seems to have been the main issue, so we want to be able to transfer data from Oracle 19c to our SQL Server 2016 databases using SSIS in batches, but want to ask the experts what the best/fastest way to do this is.

Is Microsoft Connector for Oracle the way to go as far as driver? Since we're not on SQL Server 2019, this article says we also need to install the Oracle Client and Microsoft Connector Version 4.0 for Oracle by Attunity. What exactly is the Oracle Client? Is it one of these? If so, which one, based on our setup?

Also, should we use ADO.NET components in the data flow just like we did with DB2? In other words, is the single record vs. record batches difference driven by the driver used to connect, the type of components in the data flow or both need to go hand in hand for this to work?

Thanks in advance for your responses!


Solution

  • OLEDB connections are not slow by themselves - it's a matter or what features the driver has available to it. It sounds like the ADO.NET driver for DB2 allows bulk insert and the OLEDB one does not.

    Regarding Oracle, the attunity driver is the way to go. You'll need to install the oracle driver as well. The links that you have look correct to me but I don't have access to test.

    Also, please note that dataflows will batch data by default in increments of the buffer size. 10k rows for example.