I am querying a data warehouse (so I cannot redesign tables), I'll do my best to mimic the scenario in a simple example.
We have 3 main tables for incident, change, and release. These 3 are connected through an intermediate table called intermediate. Here is their structure along with sample data:
Incident Table:
Change Table:
Release Table:
Intermediate Table:
The first 3 tables have exact same structure, but the intermediate table holds connection of these 3 tables pairwise. For example, if Rel1 is connected to Chg1, you have a row in the intermediate table as or . These two rows have no difference and may not co-exist.
QUERY:
I want ALL release records along with number of related incidents and number of related changes. Here is how I achieved this:
WITH SourceTable AS(
SELECT R.ReleaseItem, R.Prop1, R.Prop2 , I.RelOrInc2 as [RelatedIncident] , Null as [RelatedChanges] FROM Release R
LEFT JOIN [Intermediate] I
ON R.ReleaseItem = I.RelOrInc1
WHERE SUBSTRING(I.RelOrInc2,1,3) = 'Inc'
UNION
SELECT R.ReleaseItem, R.Prop1, R.Prop2 , I.RelOrInc1 , Null as [RelatedChanges] FROM Release R
LEFT JOIN [Intermediate] I
ON R.ReleaseItem = I.RelOrInc2
WHERE SUBSTRING(I.RelOrInc1,1,3) = 'Inc'
UNION
SELECT R.ReleaseItem, R.Prop1, R.Prop2 , Null as [RelatedIncident] , I.RelOrInc2 as [RelatedChanges] FROM Release R
LEFT JOIN [Intermediate] I
ON R.ReleaseItem = I.RelOrInc1
WHERE SUBSTRING(I.RelOrInc2,1,3) = 'Chg'
UNION
SELECT R.ReleaseItem, R.Prop1, R.Prop2 , Null as [RelatedIncident] , I.RelOrInc1 as [RelatedChanges] FROM Release R
LEFT JOIN [Intermediate] I
ON R.ReleaseItem = I.RelOrInc2
WHERE SUBSTRING(I.RelOrInc1,1,3) = 'Chg'
)
SELECT REL.* , COUNT(S.RelatedIncident) As [No Of Related Incidents] , COUNT(S.[RelatedChanges]) AS [No of Related Changes] FROM Release REL
LEFT JOIN SourceTable S
ON REL.ReleaseItem = S.ReleaseItem
GROUP BY REL.ReleaseItem, REL.Prop1, REL.Prop2
This query gives me my required result:
But I think my way of handling this query is so naive and not efficient. My data warehouse may contain around millions of records in intermediate table and my approach could be too slow.
Question: Is there any better way to get such result with better performance?
BTW, I am using MS SQL Server 2012
SELECT
R.ReleaseItem, R.Prop1, R.Prop2,
[No Of Related Incidents] = (SELECT COUNT(*) FROM [Intermediate] i
WHERE (i.RelOrInc1 = r.ReleaseItem AND i.RelOrInc2 LIKE 'Inc%')
OR (i.RelOrInc2 = r.ReleaseItem AND i.RelOrInc1 LIKE 'Inc%')),
[No of Related Changes] = (SELECT COUNT(*) FROM [Intermediate] i
WHERE (i.RelOrInc1 = r.ReleaseItem AND i.RelOrInc2 LIKE 'Chg%')
OR (i.RelOrInc2 = r.ReleaseItem AND i.RelOrInc1 LIKE 'Chg%'))
FROM Release R