Search code examples
sql-servermergessisetlstaging

SSIS 14 - Staging Area - Merge two sources is taking a lot of time


I've two tables:

  • Table A: 631 476 rows
  • Table B: 12 90 rows

Eache Table have the Field ID that I want to use it as Key in Merge Object. In the following image is possible to see that the process blocks before the Merge Object. I already test with Merge Join object and results are the same... enter image description here

Which other possibilities I have in order to make this operation using SSIS 14?

Thanks!


Solution

  • If both sources tables are in the same server, Don't use this way. You should simply write an query in SQL Server side.

    Something like this :

    SELECT * 
    FROM [Table A]
    INNER JOIN [Table B] ON [Table A].ID  = [Table B].ID 
    ORDER BY ... 
    

    As James Serra said : When to use T-SQL or SSIS for ETL

    1. Performance – With T-SQL, everything is processed within the SQL engine. With SSIS, you are bringing all the data over to the SSIS memory space and doing the manipulation there. So if speed is an issue, usually T-SQL is the way to go, especially when dealing with a lot of records. Something like a JOIN statement in T-SQL will go much faster than using lookup tasks in SSIS. Another example is a MERGE statement in T-SQL has much better performance than a SCD task in SSIS for large tasks
    2. Features/capabilities – Some features can only be done in either T-SQL or SSIS. You can shred text in SSIS, but can’t in T-SQL. For example, text files with an inconsistent number of fields per row can only be done in SSIS. So certain tasks may force you into using one or the other
    3. Current skill set – Are the people in your IT department more familiar with SSIS or T-SQL?
    4. Ease of development/maintenance – Of course, whatever one you are most familiar with will be the easiest, but if your skills at both are fairly even, then SSIS is usually easier to use because it is graphical, but sometimes you can develop quicker in T-SQL. For example, having to join a bunch of tables will require a bunch of tasks in SSIS, where in T-SQL it is one statement. So it might be easier to create the tasks to join the tables in SSIS, but it will take longer to build then writing a T-SQL statement
    5. Complexity – SSIS can be more complex because you might need to create many tasks to accomplish your objective, where in T-SQL it might just be one statement, like in the example above for joining tables
    6. Extensibility – SSIS has better extensibility because you can create a script task that uses C# that can do just about anything, especially for non-database related tasks. T-SQL is limited because it is only for database tasks. SSIS also has logging, which T-SQL does not
    7. Likelihood of depracation/breaking changes – Minor issue, but T-SQL is always removing features in each release that will have to be rewritten
    8. Types/architecture of sources and destinations – SSIS is better if you have multiple types of sources. For example, it works really well with Oracle, XML, flat-files, etc. SSIS was designed from the beginning to work well with other sources, where T-SQL is designed for SQL Server and it requires more steps to access other sources, and there are additional limitations when doing so
    9. Local regulations – Are there some company standards you have to adhere to that would limit which tool you can use?