I use two data source (tables) ODS.dbo.Account_Address__c
and PAF.dbo.Delivery_Addresses
to make a statistical summary about the address data in table ODS.dbo.Account_Address__c
. Table PAF.dbo.Delivery_Addresses
was for look-up and inner join.
All data of interest is all rows since CreatedDate=2016-09-28
and also those with IsCurrent=1
prior to 2016-09-28
.
Table 1:
SELECT DPID__c,CreatedDate
FROM ODS.dbo.Account_Address__c
WHERE CreatedDate>='2016-09-28'
UNION ALL
SELECT DPID__c,CreatedDate
FROM ODS.dbo.Account_Address__c
WHERE CreatedDate<'2016-09-28' AND IsCurrent=1
There are totally 265773 rows
.
Then I subset the data by inner join with PAF.dbo.Delivery_Addresses
,241565 rows
were returned.
Table 2:
SELECT
aa.DPID__c, aa.CreatedDate
FROM
ODS.dbo.Account_Address__c aa ,PAF.dbo.Delivery_Addresses da
WHERE
aa.DPID__c = CONVERT(VARCHAR,da.DELIVERY_POINT_ID)
AND aa.CreatedDate>='2016-09-28'
AND aa.DPID__c IS NOT NULL
UNION ALL
SELECT
aa.DPID__c,aa.CreatedDate
FROM
ODS.dbo.Account_Address__c aa ,PAF.dbo.Delivery_Addresses da
WHERE
aa.DPID__c = CONVERT(VARCHAR,da.DELIVERY_POINT_ID)
AND aa.CreatedDate<'2016-09-28'
AND aa.DPID__c IS NOT NULL
AND aa.IsCurrent=1
Subsequently I also check two other subsets.
Table 3:
NULL DPID__c
of 11170 rows
:
SELECT DPID__c,CreatedDate
FROM ODS.dbo.Account_Address__c
WHERE CreatedDate>='2016-09-28' AND DPID__c IS NULL
UNION ALL
SELECT DPID__c,CreatedDate
FROM ODS.dbo.Account_Address__c
WHERE CreatedDate<'2016-09-28' AND IsCurrent=1 AND DPID__c IS NULL
And reject inner join set, 12982 rows
returned.
Table 4:
SELECT a.DPID__c,a.CreatedDate
FROM
(
SELECT DPID__c,CreatedDate
FROM ODS.dbo.Account_Address__c
WHERE CreatedDate>='2016-09-28' AND DPID__c IS NOT NULL
UNION ALL
SELECT DPID__c,CreatedDate
FROM ODS.dbo.Account_Address__c
WHERE CreatedDate<'2016-09-28' AND DPID__c IS NOT NULL AND IsCurrent=1
) a
EXCEPT
SELECT t.DPID__c,t.CreatedDate
FROM
(
SELECT
aa.DPID__c, aa.CreatedDate
FROM
ODS.dbo.Account_Address__c aa ,PAF.dbo.Delivery_Addresses da
WHERE
aa.DPID__c = CONVERT(VARCHAR,da.DELIVERY_POINT_ID)
AND aa.CreatedDate>='2016-09-28'
AND aa.DPID__c IS NOT NULL
UNION ALL
SELECT
aa.DPID__c, aa.CreatedDate
FROM
ODS.dbo.Account_Address__c aa ,PAF.dbo.Delivery_Addresses da
WHERE
aa.DPID__c = CONVERT(VARCHAR,da.DELIVERY_POINT_ID)
AND aa.CreatedDate<'2016-09-28'
AND aa.DPID__c IS NOT NULL
AND aa.IsCurrent=1
) t
As verified, Count_Row(Table 1)>Count_Row(Table 2)+Count_Row(Table 3)+Count_Row(Table 4)
and still have 56
extra rows in Table 1
but not in any other tables.
Could anyone help to suggest how this gap could occur and how can I detect the remaining 56 rows
?
Cheers
your except
removes duplicates. you should use an except all
.
Unfortunately SQL server doesn't have except all
, but that is probably where your lines are dropped.
Check the unique lines in each query, you should have the same number.