Search code examples
t-sqlcomparecommon-table-expressionsql-server-2017not-exists

SQL 2017 - Comparing values between two tables where certain values can be NULL


I have the following Tables with the following data:

CREATE TABLE TestSource (
    InstrumentID int,
    ProviderID int,
    KPI1 int,
    Col2 varchar(255),
    KPI3 int
    );

CREATE TABLE TestTarget (
    InstrumentID int,
    ProviderID int,
    KPI1 int,
    Col2 varchar(255),
    KPI3 int 
    );

INSERT INTO TestSource (InstrumentID,ProviderID,KPI1,Col2,KPI3)
VALUES  (123, 27, 1, 'ABC', 10.0 ),
            (1234, 27, 2, 'DEF', 10.0 ),
            (345, 27, 1, NULL, 0.00 );

INSERT INTO TestTarget (InstrumentID,ProviderID,KPI1,Col2,KPI3)
VALUES  (123, 27, 1, 'ABC', 10.0 ),
            (1234, 27, 2, 'DEF', 10.0 ),
            (345, 27, 1, 'ABC', 0.0 );

I'm trying to compare the values between tables. Here's the query logic I am currently using:

DECLARE @Result NVARCHAR(max)

;WITH 

compare_source (InstrumentID,ProviderID,

/*** Source columns to compare ***/

            Col1Source, Col2Source,Col3Source

)

as (

              select     InstrumentID
                        ,ProviderID
                        ,KPI1
                        --,ISNULL(Col2,'NA') as Col2
                        ,Col2
                        ,KPI3

              from TestSource

              group by
                         InstrumentID
                        ,ProviderID
                        ,KPI1
                        ,Col2
                        ,KPI3
),

compare_target (InstrumentID,ProviderID,

/*** Target columns to compare ***/

            Col1Target,Col2Target,Col3Target

)

as 

(
            select
                     InstrumentID
                    ,ProviderID
                    ,KPI1
                    --,1
                    ,Col2
                    ,KPI3


            from TestTarget

            group by

                     InstrumentID
                    ,ProviderID
                    ,KPI1
                    ,Col2
                    ,KPI3
)

    SELECT @Result = STRING_AGG ('InstrumentID = ' + CONVERT(VARCHAR,InstrumentID)

         + ', Col1: ' + CONVERT(VARCHAR,Col1Source) + ' vs ' + CONVERT(VARCHAR,Col1Target)

        + ', Col2: ' + CONVERT(VARCHAR,Col2Source) + ' vs ' + CONVERT(VARCHAR,Col2Target)

        + ', Col3: ' + CONVERT(VARCHAR,Col3Source) + ' vs ' + CONVERT(VARCHAR,Col3Target) 

    , CHAR(13) + CHAR(10)

    )

FROM 
(
            select 
                     s.InstrumentID
                    ,s.Col1Source
                    ,t.Col1Target
                    ,s.Col2Source
                    ,t.Col2Target
                    ,s.Col3Source
                    ,t.Col3Target 

            from compare_source s

            left join compare_target t on t.InstrumentID = s.InstrumentID and t.ProviderID = s.ProviderID

            where not exists

            (
               select 1 from compare_target t where

                s.InstrumentID = t.InstrumentID AND 
              ( s.Col1Source   = t.Col1Target ) OR (ISNULL(s.Col1Source, t.Col1Target) IS NULL)  AND
              ( s.Col2Source   = t.Col2Target ) OR (ISNULL(s.Col2Source, t.Col2Target) IS NULL)  AND
              ( s.Col3Source   = t.Col3Target ) OR (ISNULL(s.Col3Source, t.Col3Target) IS NULL) 
        )

) diff

PRINT @Result

When there are no NULL values in my tables, the comparison works well. However, as soon as I attempt to insert NULLs in either of the tables, my comparison logic breaks down and does not account for the differences between tables values.

I know that I could easily do an ISNULL on my columns in my individual selects, however, I'd like to keep it as generic as possible and to only do my comparison checks and NULL checks in my final NOT EXISTS comparison WHERE clause.

I've also tried the following logic in my comparison logic without success:

            (
                   select 1 from compare_target t where

                    s.InstrumentID = t.InstrumentID AND 
                  ( s.Col1Source   = t.Col1Target OR (s.Col1Source IS NULL AND t.Col1Target IS NULL) ) AND
                  ( s.Col2Source   = t.Col2Target OR (s.Col2Source IS NULL AND t.Col2Target IS NULL) ) AND
                  ( s.Col3Source   = t.Col3Target OR (s.Col3Source IS NULL AND t.Col3Target IS NULL) )
            )

Another issue I am having is that my query cannot distinguish between data formats (for example, it sees the value 0.00 as equivalent to 0.0)

I'm not totally certain as to what I am missing.

Any help to put me on the right path would be great.


Solution

  • Well the two problems I see are this:

    1. The WHERE clause at the bottom needs to have extra parenthesis to combine your ORs with your ANDs so that the order of precedence is correct:

        select 1 from compare_target t where
      
          s.InstrumentID = t.InstrumentID AND 
        (( s.Col1Source   = t.Col1Target ) OR (ISNULL(s.Col1Source, t.Col1Target) IS NULL))  AND
        (( s.Col2Source   = t.Col2Target ) OR (ISNULL(s.Col2Source, t.Col2Target) IS NULL))  AND
        (( s.Col3Source   = t.Col3Target ) OR (ISNULL(s.Col3Source, t.Col3Target) IS NULL)) 
      
    2. When you make that change the one row that is returned has a NULL value in the Col2Source column. So when you try and build the string that you are sending to STRING_AGG it has a NULL in the middle of it. So the entire string will be NULL. So you will need to use ISNULL in either the subquery in your FROM clause or within the STRING_AGG()....or is suppose right where you had it commented out.