
Is there a limit to how many declared variables I can have in the SQLCommand area of an OLE DB Command task?


I am trying to capture which records were deleted between our on-prem staging and dbo tables before they get deleted, so that we can pass that information to Azure SQL and delete them in the corresponding tables there. To do this, I added a step to the SSIS package that handles the on-prem CDC: an OLE DB Command task with a SQL command that inserts the key values of records that are about to be deleted into a separate table. This works great for any table that has 2 or fewer keys. As soon as I try to apply the same logic to a table with 3+ keys, I get a 'statement(s) could not be prepared' error.

I am guessing there is some kind of limit on how many variables I can set within the SqlCommand field, but is there a way around this?

My logic in the SqlCommand looks something like this:

`DECLARE @P3 AS INT = ?
DECLARE @P4 AS SMALLINT = ?
DECLARE @P5 AS SMALLINT = ?
DECLARE @P6 AS CHAR(1) = ?

INSERT INTO stage.records_to_delete
VALUES('table_name', CONCAT('delete <table_name> WHERE key1 = ''', @P3, ''' AND key2 = ''', @P4, ''' AND key3 = ''', @P5, ''' AND key4 = ''', @P6, ''''), getdate())`

This inserts the table name, a sql statement to delete the particular record, and the date into my deleted records table.

What I have tried:

  • Running the sql in SSMS - it works like a charm, so I know it's not a syntax error.
  • Manually creating the OLE DB Command Input External Columns - I hoped this would bypass the issue, but it resulted in the same error.
  • Tested to make sure it wasn't a specific datatype that I was setting a variable to that was causing the problem. Each works fine individually, just not in groups of 3+.

Solution

  • My original assumption was that this had something to do with how SQL Server handles parameterized statements. If you turn on a trace or an Extended Events session, you'll see it uses the P0, P1, etc. nomenclature, and you can then get a conflict if you use those names along with ... sp_executesql or something like that; the details are hazy.

    My first failed attempt was to merely rename your parameters, like the following:

    DECLARE
        @SSIS_P0 int = ?
    ,   @SSIS_P1 int = ?
    ,   @SSIS_P2 int = ?
    ,   @SSIS_P3 int = ?
    ,   @SSIS_P4 int = ?
    ,   @SSIS_P5 int = ?
    ,   @SSIS_P6 int = ?
    ,   @SSIS_P7 int = ?
    ,   @SSIS_P8 int = ?
    ,   @SSIS_P9 int = ?
    ,   @table_name sysname = 'SO_';
    INSERT INTO stage.records_to_delete
    VALUES(@table_name, CONCAT('delete ', @table_name, ' WHERE key0 = ''', @SSIS_P0, ''' '
    , ' AND key1 = ''', @SSIS_P1, ''' '
    , ' AND key2 = ''', @SSIS_P2, ''' '
    , ' AND key3 = ''', @SSIS_P3, ''' '
    , ' AND key4 = ''', @SSIS_P4, ''' '
    , ' AND key5 = ''', @SSIS_P5, ''' '
    , ' AND key6 = ''', @SSIS_P6, ''' '
    , ' AND key7 = ''', @SSIS_P7, ''' '
    , ' AND key8 = ''', @SSIS_P8, ''' '
    , ' AND key9 = ''', @SSIS_P9, ''' '
    ), getdate());
    

    The OLE DB Command component rejects that with

    Error at DFT Do the thing [OLE DB Command [46]]: SSIS Error Code DTS_E_OLEDBERROR. An OLE DB error has occurred. Error code: 0x80004005. An OLE DB record is available. Source: "Microsoft OLE DB Provider for SQL Server" Hresult: 0x80004005 Description: "Syntax error or access violation".

    But it's as you say, replace the ? with values and run in SSMS and it works.
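
    For reference, the SSMS check amounts to something like the sketch below. The literal values are made up, the list is shortened to three parameters, and I've swapped the INSERT for a SELECT so it runs without the stage table and lets you eyeball the generated statement:

    ```sql
    -- Sketch only: literals stand in for the ? placeholders (values are made up).
    DECLARE
        @SSIS_P0 int = 10
    ,   @SSIS_P1 int = 20
    ,   @SSIS_P2 int = 30
    ,   @table_name sysname = 'SO_';

    -- SELECT instead of INSERT so the generated statement can be inspected first.
    SELECT CONCAT('delete ', @table_name, ' WHERE key0 = ''', @SSIS_P0, ''' '
    , ' AND key1 = ''', @SSIS_P1, ''' '
    , ' AND key2 = ''', @SSIS_P2, ''' '
    ) AS delete_statement;
    ```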

    Thinking harder about the problem, there's no need to use OLE DB Command, as far as I can infer from your problem statement. The need is to store an arbitrary string in a column. It'd be nice to figure out why the OLE DB Command isn't acting as I'd expect, but we can solve this a different way.

    OLE DB Source

    Assume I have a source component (ole/ado/odbc) that issues the following query to a SQL Server

    SELECT 0 AS Col0
    , 1 AS Col1
    , 2 AS Col2
    , 3 AS Col3
    , 4 AS Col4
    , 5 AS Col5
    , 6 AS Col6
    , 7 AS Col7
    , 8 AS Col8
    , 9 AS Col9
    , N'Foo' AS table_name
    , CAST(getdate() AS datetime2(0)) AS insert_date;
    

    Multicast

    Add a Multicast to your data flow at the point where you have enough information to insert data into your stage.records_to_delete table. A Multicast gives you a parallel stream of operations without the memory cost of duplicating the data.

    Here, we're going to let the rest of your data flow do whatever it was doing and add a new branch to solve the "capturing arbitrary keys" problem.

    Derived Column

    As we were already building the delete statement manually, instead of using a mix of SQL and SSIS, just use the SSIS Expression language to build it.

    "delete " + table_name + " WHERE key0 = '" + (DT_WSTR,100)Col0 + "' " 
    + " AND key1 = '" + (DT_WSTR,100)Col1 + "' "
    + " AND key2 = '" + (DT_WSTR,100)Col2 + "' "
    ...
    + " AND key9 = '" + (DT_WSTR,100)Col9 + "' "
    

    The Derived Column component doesn't actually accept line breaks as shown above, whereas you can use them with Variables. The line breaks and the ellipsis (...) are only there to show the pattern.

    Reading that, we use + to concatenate strings with the columns from our data source. As we could have a variety of types, I explicitly cast each column to a Unicode string of length 100 with (DT_WSTR,100)ColName.

    "@billinkc, I'm going to downvote because you're demonstrating something that is subject to SQL injection." Yup, I am. Carry on; that's not the problem we're solving. This pattern will also produce badly escaped statements if the source data is already a string and contains a single quote, but given the samples, I'm not worried about it.
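
    If the escaping ever does become a concern, the usual fix is to double any embedded single quote before wrapping the value in quotes. Here is a minimal T-SQL sketch of the effect, using a made-up key value. The SSIS expression language has a REPLACE function as well, so the same doubling could be done inside the Derived Column.

    ```sql
    -- Sketch only: @key_value is a hypothetical string key containing a single quote.
    DECLARE @key_value nvarchar(100) = N'O''Brien';

    SELECT
        -- Unescaped: the embedded quote ends the literal early in the generated statement.
        CONCAT('delete Foo WHERE key0 = ''', @key_value, '''') AS unescaped
        -- Escaped: each single quote is doubled, so the literal stays intact.
    ,   CONCAT('delete Foo WHERE key0 = ''', REPLACE(@key_value, '''', ''''''), '''') AS escaped;
    ```

    This only addresses the quoting problem for string keys. It is not a general defense against injection.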

    If your source component does not have the table_name and insert_date or their equivalent in it, then you'll want to address that here.

    Add another column to the Derived Column component, call it something like insert_date, and use @[System::StartTime] as the expression. This is analogous to GETDATE(), but it will now be a column in our data flow.

    Add a column for your table_name if you want a really flexible approach. I hard-coded table_name as "Foo" in the source query. Solve as you wish.

    The need is for all 3 columns to be defined at this point in the data flow so the next component can operate.

    OLE DB Destination

    Rather than dealing with OLE DB Command, which issues a statement per row flowing through it, let's use a component which is designed for sending batches of inserts, like our friend OLE DB Destination.

    In the Name of table or view, choose [stage].[records_to_delete]

    In the Columns tab, I map my 3 columns to our destination
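
    I don't know the actual definition of stage.records_to_delete, so for the mapping I'm assuming something shaped like the sketch below. The column names and types are hypothetical. All I can tell from your INSERT is that there are three columns: the table name, the generated statement, and a date.

    ```sql
    -- Hypothetical definition: column names and types are assumptions, not from the question.
    -- Assumes the stage schema already exists.
    CREATE TABLE stage.records_to_delete
    (
        table_name       sysname        NOT NULL  -- e.g. N'Foo' from the source query
    ,   delete_statement nvarchar(4000) NOT NULL  -- string built in the Derived Column
    ,   insert_date      datetime2(0)   NOT NULL  -- matches the CAST in the source query
    );
    ```

    Whatever the real names are, the three data flow columns need to line up with them by type as well as by position in the mapping.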
