
How to Map Input and Output Columns dynamically in SSIS?


I have to upload data from .dbf files into SQL Server through SSIS. My output columns are fixed, but the input columns are not, because the files come from clients who may structure their data in their own way. There may be unused columns, and an input column name may differ from the output column name.

One idea I had was to store a mapping between each file's input columns and the output columns in a SQL database table, and to use only the columns that have a row for the given file ID.

But I don't know how to do that. Any ideas?

Table Example

FileID  InputColumn    OutputColumn  Active
1       CustCd         CustCode      1
1       CName          CustName      1
1       Address        CustAdd       1
2       Cust_Code      CustCode      1
2       Customer Name  CustName      1
2       Location       CustAdd       1

Solution

  • If you create a mapping table like this, there are two approaches that use it to map columns dynamically inside an SSIS package; otherwise you must build the whole package programmatically. In this answer I will try to give you some insights on how to do that.

    (1) Building Source SQL command with aliases

    Note: This approach only works if all .dbf files have the same number of columns and only the names differ.

    In this approach you generate the SQL command used as the source, based on the FileID and the mapping table you created. The FileID and the .dbf file path must be stored in variables. For example:

    Assuming that the Table name is inputoutputMapping

    Add an Execute SQL Task with the following command:

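    -- Builds "SELECT [InputColumn] as [OutputColumn],... FROM <table>" for one file
    DECLARE @strQuery as VARCHAR(4000)
    
    SET @strQuery = 'SELECT '
    
    -- Append one aliased column per mapping row of the given FileID (parameter 0)
    SELECT @strQuery = @strQuery + '[' + InputColumn + '] as [' + OutputColumn + '],'
    FROM inputoutputMapping
    WHERE FileID = ?
    
    -- Remove the trailing comma, then append the table name (parameter 1)
    SET @strQuery = SUBSTRING(@strQuery,1,LEN(@strQuery) - 1) + ' FROM ' + CAST(? as Varchar(500))
    
    SELECT @strQuery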
    

    In the Parameter Mapping tab, map the variable that contains the FileID to parameter 0 and the variable that contains the .dbf file name (which acts as the table name) to parameter 1.

    Set the ResultSet type to Single row and store result 0 in a variable of type string, for example @[User::SourceQuery].

    The ResultSet value will look like this:

    SELECT [CustCd] as [CustCode],[CName] as [CustName],[Address] as [CustAdd] FROM database1
    

    In the OLE DB Source, set the data access mode to SQL command from variable and use the @[User::SourceQuery] variable as the source.
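If you would rather build the same statement in a Script Task than in T-SQL, the string concatenation can be sketched in C# roughly as follows. The class and method names are hypothetical, and the sketch assumes the caller has already read the (InputColumn, OutputColumn) pairs for one FileID from the mapping table.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of SSIS: builds the aliased SELECT for one file.
public static class SourceQueryBuilder
{
    // mappings: Key = InputColumn, Value = OutputColumn, for a single FileID.
    // tableName: the .dbf file name that the provider exposes as a table.
    public static string Build(IEnumerable<KeyValuePair<string, string>> mappings, string tableName)
    {
        // One "[input] AS [output]" fragment per mapping row.
        List<string> columns = mappings
            .Select(m => "[" + m.Key + "] AS [" + m.Value + "]")
            .ToList();

        if (columns.Count == 0)
            throw new ArgumentException("No mappings were supplied for this file.");

        // string.Join avoids having to trim a trailing comma afterwards.
        return "SELECT " + string.Join(", ", columns) + " FROM " + tableName;
    }
}
```

The returned string would then be written to @[User::SourceQuery] in the same way the Execute SQL Task result is.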


    (2) Using a Script Component as Source

    In this approach you have to use a Script Component as Source inside the Data Flow Task:

    First of all, you need to pass the .dbf file path and the SQL Server connection to the script component via variables if you don't want to hard-code them.

    Inside the script editor, you must add an output column for each column found in the destination table and map them to the destination.

    Inside the script, you must read the .dbf file into a DataTable.

    After loading the data into a DataTable, fill another DataTable with the rows of the mapping table you created in SQL Server.
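One possible way to read the .dbf file into a DataTable is through OLE DB. The sketch below assumes the Jet 4.0 provider with its dBASE IV driver is installed and that the package runs in 32-bit mode; the class name is hypothetical, and a different provider would need a different connection string.

```csharp
using System.Data;
using System.Data.OleDb;
using System.IO;

// Hypothetical helper for the Script Component: loads one .dbf file into memory.
public static class DbfLoader
{
    // Assumption: Microsoft.Jet.OLEDB.4.0 with the dBASE IV driver is available.
    public static string BuildConnectionString(string dbfPath)
    {
        // For dBASE files the data source is the containing folder, not the file.
        string folder = Path.GetDirectoryName(dbfPath);
        return "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + folder
             + ";Extended Properties=dBASE IV;";
    }

    public static DataTable Load(string dbfPath)
    {
        // The file name without its extension acts as the table name.
        string tableName = Path.GetFileNameWithoutExtension(dbfPath);
        DataTable table = new DataTable(tableName);

        using (OleDbConnection connection = new OleDbConnection(BuildConnectionString(dbfPath)))
        using (OleDbDataAdapter adapter = new OleDbDataAdapter("SELECT * FROM [" + tableName + "]", connection))
        {
            // Fill opens and closes the connection by itself.
            adapter.Fill(table);
        }

        return table;
    }
}
```

The mapping table can be loaded into its own DataTable in the same way, using a data adapter over the SQL Server connection.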

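    After that, loop over the DataTable columns and change each .ColumnName to the corresponding output column, for example:

    foreach (DataColumn col in myTable.Columns)
    {
        // Output name mapped to this input column for FileID 1 (null if unmapped).
        string outputName = MappingTable.AsEnumerable()
            .Where(x => x.Field<int>("FileID") == 1
                     && x.Field<string>("InputColumn") == col.ColumnName)
            .Select(y => y.Field<string>("OutputColumn"))
            .FirstOrDefault();

        if (outputName != null)
            col.ColumnName = outputName;
    }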
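Since the question mentions that files may contain unused columns, a variant of the renaming step could also drop every column that has no mapping. This is only a sketch: the class name is hypothetical, and it assumes FileID is stored as an int and Active as a bit (bool) in the mapping table.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

// Hypothetical helper: renames mapped columns and removes unmapped ones.
public static class ColumnMapper
{
    // Builds an InputColumn -> OutputColumn lookup for one file (active rows only).
    public static Dictionary<string, string> BuildLookup(DataTable mappingTable, int fileId)
    {
        return mappingTable.AsEnumerable()
            .Where(r => r.Field<int>("FileID") == fileId && r.Field<bool>("Active"))
            .ToDictionary(r => r.Field<string>("InputColumn"),
                          r => r.Field<string>("OutputColumn"),
                          StringComparer.OrdinalIgnoreCase);
    }

    public static void Apply(DataTable data, Dictionary<string, string> lookup)
    {
        // Work on a copy of the column list, because columns are removed below.
        List<DataColumn> columns = data.Columns.Cast<DataColumn>().ToList();

        // Pass 1: drop unused columns so their names cannot clash with output names.
        foreach (DataColumn col in columns.Where(c => !lookup.ContainsKey(c.ColumnName)))
            data.Columns.Remove(col);

        // Pass 2: rename the remaining columns to their output names.
        foreach (DataColumn col in data.Columns.Cast<DataColumn>().ToList())
            col.ColumnName = lookup[col.ColumnName];
    }
}
```

Building the lookup once per file avoids querying the mapping table again for every column. The sketch assumes that no output name equals the input name of another mapped column in the same file, since that rename would fail with a duplicate name error.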
    

    After that, loop over each row in the DataTable and create a script output row.

    In addition, note that when assigning output rows you must check whether each column exists. You can first add all column names to a list of strings and then use it for the check, for example:

    var columnNames = myTable.Columns.Cast<DataColumn>()
                                 .Select(x => x.ColumnName)
                                 .ToList();

    // Output0Buffer is the default buffer name for an output called "Output 0".
    foreach (DataRow row in myTable.Rows)
    {
        // Start a new row in the script output.
        Output0Buffer.AddRow();

        if (columnNames.Contains("CustCode"))
        {
            // Assumes the CustCode output column is a string column.
            Output0Buffer.CustCode = row["CustCode"].ToString();
        }
        else
        {
            Output0Buffer.CustCode_IsNull = true;
        }

        // continue checking all other columns
    }
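To avoid repeating the existence check for every output column, the test can be moved into a small helper. The following is a minimal sketch with a hypothetical class name; it treats a missing column and a DBNull cell the same way and assumes the output columns are strings.

```csharp
using System;
using System.Data;

// Hypothetical helper: reads a cell safely when the column may not exist.
public static class RowReader
{
    // Returns the cell as a string, or null when the column is absent from the
    // row's table or the cell holds DBNull.
    public static string GetStringOrNull(DataRow row, string columnName)
    {
        if (!row.Table.Columns.Contains(columnName) || row.IsNull(columnName))
            return null;

        return Convert.ToString(row[columnName]);
    }
}
```

Inside the row loop, the result can be assigned to the output buffer column, setting the matching _IsNull property to true when the helper returns null.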
    

    If you need more details about using a Script Component as a source, then check one of the following links:


    (3) Building the package dynamically

    I don't think there are other methods you can use to achieve this goal, except building the package dynamically. In that case you should go with:


    (4) SchemaMapper: C# schema mapping class library

    Recently I started a new project on GitHub, a class library developed in C#. You can use it to import tabular data from Excel, Word, PowerPoint, text, CSV, HTML, JSON and XML into a SQL Server table with a different schema definition, using a schema mapping approach. Check it out at:

    You can follow this Wiki page for a step-by-step guide: