Search code examples
sqlsql-serverssissql-server-2014

Loading files with dynamically generated columns


I need to create a SSIS project that loads daily batches of 150 files into a SQL Server database. Each batch always contains the same 150 files and each file in the batch has a unique name. Also each file can either be a full or incremental type. Incremental files have one more column than the full files. Each batch contains a control file that states if a file is full or incremental. See example of a file below:

Full File

|  SID |  Name  |  DateOfBirth  |
|:---: | :----: | :-----------: |
| 1    | Samuel |   20/05/1964  |
| 2    | Dave   |   06/03/1986  |
| 3    | John   |   15/09/2001  |

Incremental File

|  SID |  Name  |  DateOfBirth  |  DeleteRow   |
|:---: | :----: | :-----------: | :----------: |
| 2    |        |               |  1           |      
| 4    | Abil   |   19/11/1993  |  0           |
| 5    | Zainab |   26/02/2006  |  0           |

I want to avoid creating 2 packages (full and incremental) for each file.

Is there a way to dynamically generate the column list in each source/destination component based on the file type in the control file? For example, when the file type is incremental, the column list should include the extra column (DeleteRow).


Solution

  • Let's assume my ControlFile.xlsx is :

    Col1        Col2
    File1.xlsx  Full
    file2.xlsx  Incremental
    

    Flow:

    1.Create a DFT where ControlFile.xlsx is captured in an object variable. Source : Control connection, Destination : RecordSet Destination

    1. Pass this object variable in ForEach loop. ResultSet variable should be capturing Col2 of ControlFile.xlsx.

    2. Create a Sequence container just for a start point. Add 2 DFD for full load and incremental load. Use the constraints (as shown in below image) to decide which DFD will run.

    3. Inside DFD, use excel source to OLEDB destination.

    4. Use FilePath variable for connection property in Full load and incremental excel connections to make it dynamic.

    Step1: overall image

    img_main

    Step2: In DFT - read control file, you read the FlowControl.xlsx to save it RecordSet destination, into RecordOutput variable

    img2

    Step3: Your precedence constraints should look like below image("Full" for full load, "Incremental" for incremental load ) :

    img3

    Use the source and destination connections as shown in first image. It's a bit hard to explain all the steps, but flow is simple.

    one thing to notice is, you have additional column in Incremental, hence you'll need to use 'Derived Column' in your full load for correct mapping.

    Also, make sure DelayValidation property is set to true.

    For each loop container uses For each ADO Enumerator. Following images describe the properties :

    img

    AND

    imgx