Search code examples
sql-servert-sqljoinpivotcoalesce

Multi Join Tsql Coalesce Inefficient Code


Need Help making code more efficient.

I have the following select (which will become an update statement) and right now I have to have 8 joins of exactly the same data. I tried a few things and failed but maybe the collective wisdom here will have some insight as to how to make it more efficient.

I have the following tables

#temp_data 
customer    season  id      percent
1           2016    835     35.93000    
1           2016    836     31.82000    
1           2016    837     11.88000    
1           2016    841     5.09000 
1           2016    839     15.28000    
1           2017    836     31.40000
1           2017    837     11.45000    
1           2017    839     14.04000    
1           2017    841     6.40000 
1           2017    835     36.70000

and

lt_program_data

customer season     value_835   value_836   value_837   value_838   value_839   value_840   value_841   value_842
1       2016        35.93000    31.82000    11.88000    NULL        15.28000    NULL        5.09000     NULL
1       2017        36.70000    31.40000    11.45000    NULL        14.04000    NULL        6.40000     NULL
1       2018        NULL        NULL        NULL        NULL        NULL        NULL        NULL        NULL
1       2019        NULL        NULL        NULL        NULL        NULL        NULL        NULL        NULL

What is happening, is that we match the two tables on season/customer/id values. In the scenarios where no value exist in the #temp_data table we use max value that exists. I was able to sort out the code to get it working. However, its large and ugly and not terribly efficient.

I either have to have 8 update statements where i explicitly state each id value (835, 836, etc). Or one large statement with 8 Joins on practically the same thing. Any advice on how to update the below for improved efficiency would be great.

As you can see the below is really the repetition of the exact same thing over and over again. While i understand the select part may require the 8 separate statements, is there a way to eliminate the 8 left outer joins to be one only.

Code:

select      a.customer, 
            fyear,
            value_835 = coalesce (( select [percent] from #temp_data d where d.season = a.fyear and d.customer = a.customer and d.id = 835),    
                                        c1.[percent] ),

            value_836 = coalesce (( select [percent] from #temp_data d where d.season = a.fyear and d.customer = a.customer and d.id = 836),    
                                        c2.[percent] ),

            value_837 = coalesce (( select [percent] from #temp_data d where d.season = a.fyear and d.customer = a.customer and d.id = 837),    
                                        c3.[percent] ),

            value_838 = coalesce (( select [percent] from #temp_data d where d.season = a.fyear and d.customer = a.customer and d.id = 838),    
                                        c4.[percent] ),

            value_839 = coalesce (( select [percent] from #temp_data d where d.season = a.fyear and d.customer = a.customer and d.id = 839),    
                                        c5.[percent] ),

            value_840 = coalesce (( select [percent] from #temp_data d where d.season = a.fyear and d.customer = a.customer and d.id = 840),    
                                        c6.[percent] ),

            value_841 = coalesce (( select [percent] from #temp_data d where d.season = a.fyear and d.customer = a.customer and d.id = 841),    
                                        c7.[percent] ),

            value_842 = coalesce (( select [percent] from #temp_data d where d.season = a.fyear and d.customer = a.customer and d.id = 842),    
                                        c8.[percent] )
from        #lt_program_data a
left outer join (   --835
                    select a.customer, a.id, a.[percent] 
                    from    #temp_data a
                    join    (   select z.customer, id, season =  max(z.season)
                                from    #temp_data z 
                                where   id = 835
                                group by z.customer, id 
                            )   b on a.customer = b.customer and a.season = b.season  and a.id = b.id 
                )  c1 on        a.customer = c1.customer 

left outer join (   --836
                    select a.customer, a.id, a.[percent]
                    from    #temp_data a
                    join    (   select z.customer, id, season =  max(z.season)
                                from    #temp_data z 
                                where   id = 836
                                group by z.customer, id 
                            )   b on a.customer = b.customer and a.season = b.season  and a.id = b.id 
                )  c2 on        a.customer = c2.customer 

left outer join (   --837
                    select a.customer, a.id, a.[percent]
                    from    #temp_data a
                    join    (   select z.customer, id, season =  max(z.season)
                                from    #temp_data z 
                                where   id = 837
                                group by z.customer, id 
                            )   b on a.customer = b.customer and a.season = b.season  and a.id = b.id 
                )  c3 on        a.customer = c3.customer 

left outer join (   --838
                    select a.customer, a.id, a.[percent]
                    from    #temp_data a
                    join    (   select z.customer, id, season =  max(z.season)
                                from    #temp_data z 
                                where   id = 838
                                group by z.customer, id 
                            )   b on a.customer = b.customer and a.season = b.season  and a.id = b.id 
                )  c4 on        a.customer = c4.customer 

left outer join (   --839
                    select a.customer, a.id, a.[percent]
                    from    #temp_data a
                    join    (   select z.customer, id, season =  max(z.season)
                                from    #temp_data z 
                                where   id = 839
                                group by z.customer, id 
                            )   b on a.customer = b.customer and a.season = b.season  and a.id = b.id 
                )  c5 on        a.customer = c5.customer 

left outer join (   --840
                    select a.customer, a.id, a.[percent]
                    from    #temp_data a
                    join    (   select z.customer, id, season =  max(z.season)
                                from    #temp_data z 
                                where   id = 840
                                group by z.customer, id 
                            )   b on a.customer = b.customer and a.season = b.season  and a.id = b.id 
                )  c6 on        a.customer = c6.customer 


left outer join (   --841
                    select a.customer, a.id, a.[percent]
                    from    #temp_data a
                    join    (   select z.customer, id, season =  max(z.season)
                                from    #temp_data z 
                                where   id = 841
                                group by z.customer, id 
                            )   b on a.customer = b.customer and a.season = b.season  and a.id = b.id 
                )  c7 on        a.customer = c7.customer 


left outer join (   --842
                    select a.customer, a.id, a.[percent]
                    from    #temp_data a
                    join    (   select z.customer, id, season =  max(z.season)
                                from    #temp_data z 
                                where   id = 842
                                group by z.customer, id 
                            )   b on a.customer = b.customer and a.season = b.season  and a.id = b.id 
                )  c8 on        a.customer = c8.customer 

This is the sample output at the end.

customer    fyear   value_835   value_836   value_837   value_838   value_839   value_840   value_841   value_842
1           2016    35.93000    31.82000    11.88000    NULL        15.28000    NULL        5.09000     NULL
1           2017    36.70000    31.40000    11.45000    NULL        14.04000    NULL        6.40000     NULL
1           2018    36.70000    31.40000    11.45000    NULL        14.04000    NULL        6.40000     NULL
1           2019    36.70000    31.40000    11.45000    NULL        14.04000    NULL        6.40000     NULL

Solution

  • I've found that wrapping my logical query blocks in a CTE helps me think more clearly about it. Here's one that pivots #temp_data instead of unpivot-ing lt_program_data. If I understood you correctly, you should be able to left outer join cross apply with LatestData to get what you need.

    ;with LatestData as (
        select Customer, Season, [835],[836],[837],[838],[839],[840],[841],[842] from 
        (select Customer, Season, [percent], id from temp_data td) as TempSource
        pivot 
        ( sum([percent]) for id in ([835],[836],[837],[838],[839],[840],[841],[842])) as TempPivoted
        )
    
     select a.customer,
        a.season,
        value_835 = coalesce(a.value_835, latest.[835]),
        value_836 = coalesce(a.value_836, latest.[836]),
        value_837 = coalesce(a.value_837, latest.[837]),
        value_838 = coalesce(a.value_838, latest.[838]),
        value_839 = coalesce(a.value_839, latest.[839]),
        value_840 = coalesce(a.value_840, latest.[840]),
        value_841 = coalesce(a.value_841, latest.[841]),
        value_842 = coalesce(a.value_842, latest.[842])
    from lt_program_data a
    cross apply (select top 1 * from LatestData where LatestData.customer = a.customer and latestdata.season <= a.season order by season desc) as latest