sql-server, t-sql, dynamic-sql, sp-executesql

sp_executesql In Loop Taking FOREVER to Run. Slowing down as it runs. Memory Leak?


The long and short of it is that my query is taking FOREVER to run, and I think it has to do with the sp_executesql calls. Yes, there are loops, and yes, I am using dynamic SQL.

I use loops enough to know that something is going on: this is taking far longer than it should to run. The only thing I can think of is that the sp_executesql call is not well formed or something.

The [tmp_compare] table will have about 8500 records, and there are 15 fields it compares, so I know that's roughly 127,500 "processes", but it should not take HOURS to run, right?!

I tracked it running for 90 minutes. It starts out at 84 records (entries into data_log) per second, and after an hour it is down to 14 records per second; it gradually slows down as it goes. The last time I saw something like this was when a program I wrote had a memory leak. I fixed a previous issue I had with the data types, which sped it up a bit, but something still seems wrong. Do I need to clear variables or something? What am I missing here?

Purpose: I am comparing specific columns from two tables and writing the data, plus a proposed action based on that data, to another table to act upon later. The [contact_compare_fields] table holds the field names for the two tables, and [tmp_compare] is just a list of the IDs.

Table Creation/Prep:

CREATE TABLE [dbo].[contact_compare_fields] 
(
    [id]       int identity(1,1),
    [sf_field] varchar(100),
    [h_field]  varchar(100)
);

INSERT INTO [contact_compare_fields] 
VALUES
    ('Title',               'Job Title'),
    ('FirstName',           'First Name'),
    ('LastName',            'Last Name'),
    ('MailingStreet',       'Person Street'),
    ('MailingCity',         'Person City'),
    ('MailingState',        'Person State'),
    ('MailingPostalCode',   'Person Zip Code'),
    ('MailingCountry',      'Country'),
    ('Department',          'Department'),
    ('Email',               'Email Address'),
    ('Phone',               'Direct Phone Number'),
    ('Fax',                 'Fax'),
    ('MobilePhone',         'Mobile Phone'),
    ('Job_Level__c',        'Management Level'),
    ('Job_Role__c',         'Job Function')
    
CREATE TABLE [dbo].[tmp_compare] 
(
    [id]      int identity(1,1),
    [H_ZI_ID] bigint
)

INSERT INTO [tmp_compare] 
VALUES
    ('1001122877'),('1001125385'),('1002260105'),('100233801'),('1002661679')

CREATE TABLE [dbo].[h_processed] 
(
    [Contact ID] varchar(255),
    [Job Title]  varchar(255)
)

INSERT INTO [h_processed] 
VALUES ('1001122877', 'Chief Financial Officer')

CREATE TABLE [dbo].[sf_contact] 
(
    [Contact ID] varchar(255),
    [Title]  varchar(255)
)

INSERT INTO [sf_contact] 
VALUES ('1001122877', 'CFO')

CREATE TABLE [data_log] (
    [id]        int identity(1,1),
    [dataID]    varchar(500),
    [field]     varchar(128),
    [sf_data]   varchar(500),
    [h_data]    varchar(500),
    [score]     float,
    [action]    varchar(128)
);

CODE:

DECLARE
    @fld_cnt int = 1,
    @fld_max int = 0,
    @cnt int = 1,
    @max int = 0,
    @score float,
    @zi bigint,
    @value int = 0,
    @sf_field nvarchar(500) = '',
    @h_field nvarchar(500) = '',
    @sf_data nvarchar(500) = '',
    @h_data nvarchar(500) = '',
    @ParamDef nvarchar(500) = '',
    @sql nvarchar(max)
    
DROP TABLE IF EXISTS [tmp_compare];
SELECT
    identity(int,1,1) AS [id],
    h.[Contact ID] AS [H_ZI_ID]
 INTO [tmp_compare]
 FROM [h_processed] h
  LEFT JOIN [sf_contact] s ON
    h.[Contact ID] = s.[DOZISF__ZI_ID__c]
  WHERE s.[DOZISF__ZI_ID__c] IS NOT NULL 
    AND s.[DOZISF__ZI_ID__c] <> ''

SELECT @fld_max = COUNT(*) FROM [contact_compare_fields]
SELECT @max = COUNT(*) FROM [tmp_compare]


-- Cycle Through Every ID in [tmp_compare] table
WHILE (@cnt <= @max)
BEGIN
    SELECT @zi = [H_ZI_ID] FROM [tmp_compare] WHERE [id] = @cnt

    -- Cycle Through Every Field in the Table
    WHILE (@fld_cnt <= @fld_max)
    BEGIN
        SELECT @sf_field = [sf_field], @h_field = [h_field] FROM [contact_compare_fields] WHERE [id] = @fld_cnt
        
        SET @sql = N'SELECT @h_dataOUT = SDU_Tools.NULLifBlank(' + QUOTENAME(@h_field) + ') FROM [h_processed] WHERE [Contact ID] = @ziIN;';
        SET @ParamDef = N'@ziIN bigint, @h_dataOUT varchar(500) OUTPUT';
        EXEC sp_executesql @sql, @ParamDef, @ziIN = @zi, @h_dataOUT = @h_data OUTPUT;

        SET @sql = N'SELECT @sf_dataOUT  = SDU_Tools.NULLifBlank(' + QUOTENAME(@sf_field) + ') FROM [sf_contact] WHERE [DOZISF__ZI_ID__C] = @ziIN;';
        SET @ParamDef = N'@ziIN bigint, @sf_dataOUT varchar(500) OUTPUT';
        EXEC sp_executesql @sql, @ParamDef, @ziIN = @zi, @sf_dataOUT = @sf_data OUTPUT;

        INSERT INTO [data_log] VALUES
            (@zi, @sf_field, @sf_data, @h_data, NULL, '')

        IF (@sf_data = @h_data)
            UPDATE [data_log] SET [action] = 'None' WHERE [DataID] = @zi AND [field] = @sf_field;
        ELSE IF (@sf_data IS NULL) AND (@h_data IS NULL)
            UPDATE [data_log] SET [action] = 'None' WHERE [DataID] = @zi AND [field] = @sf_field;
        ELSE IF (@sf_data IS NOT NULL) AND (@h_data IS NULL)
            UPDATE [data_log] SET [action] = 'None' WHERE [DataID] = @zi AND [field] = @sf_field;
        ELSE IF (@sf_data IS NULL) AND (@h_data IS NOT NULL)
            UPDATE [data_log] SET [action] = 'UPDATE' WHERE [DataID] = @zi AND [field] = @sf_field;
        ELSE
            UPDATE [data_log] SET [action] = 'Needs Review' WHERE [DataID] = @zi AND [field] = @sf_field;

        SET @fld_cnt = @fld_cnt + 1
    END

    SET @fld_cnt = 1
    SET @cnt = @cnt + 1
END

Solution

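  • This doesn't need loops at all. You just need to dynamically construct a joined query that unpivots the columns and compares them, inserting into the log as necessary. The gradual slowdown is most likely not a memory leak: [data_log] has no index, so every UPDATE inside the loop has to scan a table that grows with each insert.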

    DECLARE @cols nvarchar(max), @sql nvarchar(max);
    
    SELECT @cols = STRING_AGG(N'
          (' + QUOTENAME(ccf.sf_field, '''') + ', sf.' + QUOTENAME(ccf.sf_field) + ', hp.' + QUOTENAME(ccf.h_field) + ')',
        N','
      )
    FROM contact_compare_fields ccf;
    
    SET @sql = N'
    INSERT data_log (dataID, field, sf_data, h_data, action)
    SELECT
      sf.[Contact ID],
      v.column_name,
      v.sf_data,
      v.h_data,
      CASE WHEN v.sf_data = v.h_data OR v.h_data IS NULL THEN ''None''
           WHEN v.sf_data IS NULL AND v.h_data IS NOT NULL THEN ''UPDATE''
           ELSE ''Needs Review''
      END
    FROM tmp_compare tc
    JOIN sf_contact sf ON sf.[Contact ID] = CAST(tc.[H_ZI_ID] AS varchar(255))
    JOIN h_processed hp ON hp.[Contact ID] = sf.[Contact ID]
    CROSS APPLY (VALUES' + @cols + '
    ) v(column_name, sf_data, h_data);
    ';
    
    PRINT @sql;   -- your friend
    
    EXEC sp_executesql @sql;
    
    SELECT * FROM data_log
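
    To make the generated statement concrete, here is a hand-expanded sketch of what the dynamic SQL above should produce for a single field pair. It assumes the sample tables from the question, which only contain the Title / Job Title columns; with all 15 rows in contact_compare_fields there would be one VALUES row per field pair.

    ```sql
    -- Hand-written equivalent of the generated statement for one field pair.
    -- Assumes the sample tables from the question (only Title / Job Title exist).
    INSERT data_log (dataID, field, sf_data, h_data, action)
    SELECT
      sf.[Contact ID],
      v.column_name,
      v.sf_data,
      v.h_data,
      CASE WHEN v.sf_data = v.h_data OR v.h_data IS NULL THEN 'None'
           WHEN v.sf_data IS NULL AND v.h_data IS NOT NULL THEN 'UPDATE'
           ELSE 'Needs Review'
      END
    FROM tmp_compare tc
    JOIN sf_contact sf ON sf.[Contact ID] = CAST(tc.[H_ZI_ID] AS varchar(255))
    JOIN h_processed hp ON hp.[Contact ID] = sf.[Contact ID]
    CROSS APPLY (VALUES
          ('Title', sf.[Title], hp.[Job Title])
    ) v(column_name, sf_data, h_data);
    ```

    Against the sample rows, only ID 1001122877 exists in both tables, so this logs a single row comparing 'CFO' with 'Chief Financial Officer', and the CASE falls through to 'Needs Review'.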
    

    Note:

    • Make sure you always quote column names with QUOTENAME.
    • Column and object names should be stored in sysname typed columns and variables.
    • If the column types differ then you should cast them all to the same type, either a string type or sql_variant.
    • In your sample, tc.[H_ZI_ID] and [Contact ID] have different data types. These should be the same type.
    • The tables should all have clustered indexes on [Contact ID] and [H_ZI_ID] respectively.
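
    The notes above could be applied roughly as follows. This is only a sketch under stated assumptions: it assumes every existing ID converts cleanly to bigint, that the key columns contain no NULLs, and that nvarchar(500) is wide enough for every compared column. The index names are made up for illustration.

    ```sql
    -- Sketch only: bigint keys, nvarchar(500) data and the index names are assumptions.

    -- 1. Give the join keys one common type.
    ALTER TABLE dbo.sf_contact  ALTER COLUMN [Contact ID] bigint NOT NULL;
    ALTER TABLE dbo.h_processed ALTER COLUMN [Contact ID] bigint NOT NULL;
    ALTER TABLE dbo.tmp_compare ALTER COLUMN [H_ZI_ID]    bigint NOT NULL;

    -- 2. Cluster each table on its key so the joins can seek instead of scan.
    CREATE CLUSTERED INDEX CX_sf_contact  ON dbo.sf_contact  ([Contact ID]);
    CREATE CLUSTERED INDEX CX_h_processed ON dbo.h_processed ([Contact ID]);
    CREATE CLUSTERED INDEX CX_tmp_compare ON dbo.tmp_compare ([H_ZI_ID]);

    -- 3. Store the field names as sysname (an alias for nvarchar(128) NOT NULL).
    ALTER TABLE dbo.contact_compare_fields ALTER COLUMN sf_field sysname NOT NULL;
    ALTER TABLE dbo.contact_compare_fields ALTER COLUMN h_field  sysname NOT NULL;

    -- 4. Cast both sides to one type when building the VALUES rows.
    DECLARE @cols nvarchar(max);

    SELECT @cols = STRING_AGG(
        CAST(N'
          (' + QUOTENAME(ccf.sf_field, '''')
             + N', CAST(sf.' + QUOTENAME(ccf.sf_field) + N' AS nvarchar(500))'
             + N', CAST(hp.' + QUOTENAME(ccf.h_field)  + N' AS nvarchar(500)))'
             AS nvarchar(max)),
        N',')
    FROM dbo.contact_compare_fields ccf;

    PRINT @cols;   -- inspect the generated rows before using them
    ```

    Once the key columns share a type, the CAST(tc.[H_ZI_ID] AS varchar(255)) in the join is no longer needed and can be replaced with a plain equality on the two columns.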