Search code examples
sql-serverunionaggregate-functionsdata-partitioning

Aggregate sum on two columns in the same table


I am querying a data warehouse (so I cannot redesign tables), I'll do my best to mimic the scenario in a simple example.

We have 3 main tables for incident, change, and release. These 3 are connected through an intermediate table called intermediate. Here is their structure along with sample data:

Incident Table:

enter image description here

Change Table:

enter image description here

Release Table:

enter image description here

Intermediate Table:

enter image description here

The first 3 tables have exact same structure, but the intermediate table holds connection of these 3 tables pairwise. For example, if Rel1 is connected to Chg1, you have a row in the intermediate table as or . These two rows have no difference and may not co-exist.

QUERY:

I want ALL release records along with number of related incidents and number of related changes. Here is how I achieved this:

 WITH SourceTable AS(
SELECT R.ReleaseItem, R.Prop1, R.Prop2 , I.RelOrInc2 as [RelatedIncident] , Null as [RelatedChanges] FROM Release R
    LEFT JOIN [Intermediate] I
    ON R.ReleaseItem = I.RelOrInc1
    WHERE SUBSTRING(I.RelOrInc2,1,3) = 'Inc'
UNION

SELECT R.ReleaseItem, R.Prop1, R.Prop2 , I.RelOrInc1 , Null as [RelatedChanges] FROM Release R
    LEFT JOIN [Intermediate] I
    ON R.ReleaseItem = I.RelOrInc2
    WHERE SUBSTRING(I.RelOrInc1,1,3) = 'Inc'

UNION

SELECT R.ReleaseItem, R.Prop1, R.Prop2 , Null as [RelatedIncident] , I.RelOrInc2 as [RelatedChanges] FROM Release R
    LEFT JOIN [Intermediate] I
    ON R.ReleaseItem = I.RelOrInc1
    WHERE SUBSTRING(I.RelOrInc2,1,3) = 'Chg'
UNION

SELECT R.ReleaseItem, R.Prop1, R.Prop2 , Null as [RelatedIncident] , I.RelOrInc1  as [RelatedChanges] FROM Release R
    LEFT JOIN [Intermediate] I
    ON R.ReleaseItem = I.RelOrInc2
    WHERE SUBSTRING(I.RelOrInc1,1,3) = 'Chg'
    ) 

SELECT REL.* , COUNT(S.RelatedIncident) As [No Of Related Incidents] , COUNT(S.[RelatedChanges]) AS [No of Related Changes] FROM Release REL 
    LEFT JOIN SourceTable S
    ON REL.ReleaseItem = S.ReleaseItem
GROUP BY REL.ReleaseItem, REL.Prop1, REL.Prop2

This query gives me my required result:

enter image description here

But I think my way of handling this query is so naive and not efficient. My data warehouse may contain around millions of records in intermediate table and my approach could be too slow.

Question: Is there any better way to get such result with better performance?

BTW, I am using MS SQL Server 2012


Solution

  • SELECT   
      R.ReleaseItem, R.Prop1, R.Prop2,
      [No Of Related Incidents] = (SELECT COUNT(*) FROM [Intermediate] i 
             WHERE (i.RelOrInc1 = r.ReleaseItem AND i.RelOrInc2 LIKE 'Inc%') 
             OR (i.RelOrInc2 = r.ReleaseItem AND i.RelOrInc1 LIKE 'Inc%')),
      [No of Related Changes] = (SELECT COUNT(*) FROM [Intermediate] i 
             WHERE (i.RelOrInc1 = r.ReleaseItem AND i.RelOrInc2 LIKE 'Chg%') 
             OR (i.RelOrInc2 = r.ReleaseItem AND i.RelOrInc1 LIKE 'Chg%'))
    FROM Release R