I have to load a lot of data from files into a database table daily. The data comes from an ERP system, has about 5,000,000 records, and is delivered in CSV files of up to 500 MB. There is already an Integration Services job responsible for this, plus a transformation, data check, and import into another system. I forgot to mention that there is no single ID column, otherwise this would be easy (there are 3 to 7 ID columns). It is also not possible for the ERP system to deliver only the data that has changed since the last execution.
The problem is to improve the merge of the CSV files into the data table. The table always has to hold the complete data from the previous execution.
Because of poor performance I have to rework this import. The following solutions have been evaluated:
Now the question: does anybody have experience with such requirements? Does anybody have a good suggestion for solving this problem?
Thanks in advance!
MH
I haven't started the implementation yet, but after talking to an Integration Services specialist, the best solution is to solve this with a temp table and a lookup. First you create the temp table with a T-SQL task, then you insert the records from the file with the usual copy task. In the lookup you decide whether each record is an insert or an update.
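As a rough sketch of what the T-SQL side of this could look like (all table and column names here are made up for illustration; a three-column key stands in for the real 3 to 7 ID columns), the staging table plus a single set-based MERGE on the composite key is one way to express the insert-or-update decision:

```sql
-- Sketch only: dbo.ErpData, #Staging and all columns are assumed names.
-- The staging table mirrors the target; the SSIS copy task fills it from the CSV.
CREATE TABLE #Staging (
    PlantId  INT           NOT NULL,
    OrderId  INT           NOT NULL,
    ItemId   INT           NOT NULL,
    Quantity DECIMAL(18,2) NULL,
    [Status] NVARCHAR(20)  NULL
);

-- After the copy task has loaded the file into #Staging, one MERGE
-- matches on the composite key and does the insert/update in a single pass:
MERGE dbo.ErpData AS tgt
USING #Staging AS src
    ON  tgt.PlantId = src.PlantId
    AND tgt.OrderId = src.OrderId
    AND tgt.ItemId  = src.ItemId
WHEN MATCHED THEN
    UPDATE SET tgt.Quantity = src.Quantity,
               tgt.[Status] = src.[Status]
WHEN NOT MATCHED BY TARGET THEN
    INSERT (PlantId, OrderId, ItemId, Quantity, [Status])
    VALUES (src.PlantId, src.OrderId, src.ItemId, src.Quantity, src.[Status]);
```

For 5,000,000 rows this set-based approach usually beats a row-by-row lookup component; an index on the composite key columns of the target table matters a lot here.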
In the next two weeks I will work on the implementation, so don't hesitate to ask me for further details.