Search code examples
sqlsql-serverinner-join

How can I find the remaining rows in the SQL query


I use two data source (tables) ODS.dbo.Account_Address__c and PAF.dbo.Delivery_Addresses to make a statistical summary about the address data in table ODS.dbo.Account_Address__c. Table PAF.dbo.Delivery_Addresses was for look-up and inner join.

All data of interest is all rows since CreatedDate=2016-09-28 and also those with IsCurrent=1 prior to 2016-09-28.

Table 1:

SELECT DPID__c,CreatedDate
FROM ODS.dbo.Account_Address__c
WHERE CreatedDate>='2016-09-28'

UNION ALL

SELECT DPID__c,CreatedDate
FROM ODS.dbo.Account_Address__c
WHERE CreatedDate<'2016-09-28' AND IsCurrent=1 

There are totally 265773 rows.

Then I subset the data by inner join with PAF.dbo.Delivery_Addresses,241565 rows were returned.

Table 2:

SELECT 
     aa.DPID__c, aa.CreatedDate     
FROM 
     ODS.dbo.Account_Address__c aa ,PAF.dbo.Delivery_Addresses da
WHERE
     aa.DPID__c = CONVERT(VARCHAR,da.DELIVERY_POINT_ID)
     AND aa.CreatedDate>='2016-09-28'
     AND aa.DPID__c IS NOT NULL

UNION ALL

SELECT
     aa.DPID__c,aa.CreatedDate
FROM    
     ODS.dbo.Account_Address__c aa ,PAF.dbo.Delivery_Addresses da
WHERE
     aa.DPID__c = CONVERT(VARCHAR,da.DELIVERY_POINT_ID)
     AND aa.CreatedDate<'2016-09-28'
     AND aa.DPID__c IS NOT NULL
     AND aa.IsCurrent=1

Subsequently I also check two other subsets.

Table 3:

NULL DPID__c of 11170 rows:

SELECT DPID__c,CreatedDate
FROM ODS.dbo.Account_Address__c
WHERE CreatedDate>='2016-09-28' AND DPID__c IS NULL 

UNION ALL

SELECT DPID__c,CreatedDate
FROM ODS.dbo.Account_Address__c
WHERE CreatedDate<'2016-09-28' AND IsCurrent=1 AND DPID__c IS NULL 

And reject inner join set, 12982 rows returned.

Table 4:

SELECT a.DPID__c,a.CreatedDate
FROM
(
 SELECT DPID__c,CreatedDate
 FROM ODS.dbo.Account_Address__c
 WHERE CreatedDate>='2016-09-28' AND DPID__c IS NOT NULL

 UNION ALL

 SELECT DPID__c,CreatedDate
 FROM ODS.dbo.Account_Address__c
 WHERE CreatedDate<'2016-09-28' AND DPID__c IS NOT NULL AND IsCurrent=1
) a

EXCEPT

SELECT t.DPID__c,t.CreatedDate
FROM
(
 SELECT 
    aa.DPID__c, aa.CreatedDate      
 FROM   
    ODS.dbo.Account_Address__c aa ,PAF.dbo.Delivery_Addresses da
 WHERE
    aa.DPID__c = CONVERT(VARCHAR,da.DELIVERY_POINT_ID)
    AND aa.CreatedDate>='2016-09-28'
    AND aa.DPID__c IS NOT NULL

 UNION ALL

 SELECT
   aa.DPID__c, aa.CreatedDate           
 FROM   
   ODS.dbo.Account_Address__c aa ,PAF.dbo.Delivery_Addresses da
 WHERE
   aa.DPID__c = CONVERT(VARCHAR,da.DELIVERY_POINT_ID)
   AND aa.CreatedDate<'2016-09-28'
   AND aa.DPID__c IS NOT NULL
   AND aa.IsCurrent=1
 ) t

As verified, Count_Row(Table 1)>Count_Row(Table 2)+Count_Row(Table 3)+Count_Row(Table 4) and still have 56 extra rows in Table 1 but not in any other tables.

Could anyone help to suggest how this gap could occur and how can I detect the remaining 56 rows?

Cheers


Solution

  • your except removes duplicates. you should use an except all. Unfortunately SQL server doesn't have except all, but that is probably where your lines are dropped.

    Check the unique lines in each query, you should have the same number.