
Import most recent data from CSV to SQL Server with SSIS


Here's the deal; the issue isn't with getting the CSV into SQL Server, it's getting it to work how I want it... which I guess is always the issue :)

I have a CSV file with columns like DATE, TIME, BARCODE, etc. I use a Derived Column transformation to concatenate DATE and TIME into a DATETIME for my import into SQL Server, and I import all the data into the database. The issue is that we only get a new .CSV file every 12 hours, and for the sake of example we will say the .CSV is updated four times a minute.

If we run the job every 15 minutes, we will get a ton of overlapping data. I imagine I will use a variable, say LastCollectedTime, which can be pulled from my SQL database using MAX(ReadTime). My problem is that I only want to collect rows with a ReadTime more recent than that variable.

Destination table structure: ID, ReadTime, SubID, ...datacolumns..., LastModifiedTime where LastModifiedTime has a default value of GETDATE() on the last insert.

Any ideas? Remember, our ReadTime is a Derived Column; I'm not sure whether that matters.


Solution

  • Here is one approach that you can make use of:

    Let's assume that your destination table in SQL Server is named BarcodeData.

    1. Create a staging table (say BarcodeStaging) in your database with the same column structure as your destination table BarcodeData. The CSV data will be imported into this staging table.

    2. In the SSIS package, add an Execute SQL Task before the Data Flow Task to truncate the staging table BarcodeStaging.

    3. Import the CSV data into the staging table BarcodeStaging and not into the actual destination table.

    4. Use the MERGE statement (I assume that you are using SQL Server 2008 or later) to compare the staging table BarcodeStaging with the actual destination table BarcodeData, using the DateTime column as the join key. If the staging table contains rows that have no match in the destination table, insert those rows into the destination table.

    Technet link to MERGE statement: http://technet.microsoft.com/en-us/library/bb510625.aspx
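
As a rough sketch, steps 2 and 4 might look like the T-SQL below. The dbo schema and the Barcode column are assumptions, as is the idea that ID is an IDENTITY column; ReadTime, SubID and the LastModifiedTime default come from the table structure described in the question. Adjust the column lists to your real schema.

```sql
-- Step 2: Execute SQL Task that runs before the Data Flow Task.
-- Empties the staging table so it only ever holds the current CSV contents.
TRUNCATE TABLE dbo.BarcodeStaging;

-- Step 4: Execute SQL Task that runs after the Data Flow Task
-- has loaded the CSV rows into dbo.BarcodeStaging.
MERGE dbo.BarcodeData AS target
USING (
    -- DISTINCT collapses rows that are exact duplicates within the file.
    SELECT DISTINCT ReadTime, SubID, Barcode   -- Barcode is an assumed column
    FROM dbo.BarcodeStaging
) AS source
    -- Join key from the answer: the derived date/time column.
    -- Add more columns here if two reads can share the same timestamp.
    ON target.ReadTime = source.ReadTime
WHEN NOT MATCHED BY TARGET THEN
    -- ID (assumed IDENTITY) and LastModifiedTime (DEFAULT GETDATE())
    -- are left out so SQL Server fills them in.
    INSERT (ReadTime, SubID, Barcode)
    VALUES (source.ReadTime, source.SubID, source.Barcode);
```

Because only unmatched rows are inserted, running the package every 15 minutes against the same file should not duplicate rows that were already loaded.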

    Hope that helps.