Search code examples
sqlsql-servergroup-bycasesql-server-2017

MSSQL group rows with different values by "updating" those values


Assume following data:

enter image description here

This data needs to be grouped to get this result:

enter image description here

As you can see, the id is simply aggregated with STRING_AGG(id, ' ') WITHIN GROUP (ORDER BY id ASC). The accountNums are a bit trickier. Only if a refNum has the 2 accountNums 10040 and 10041, then they should be replaced by the value 2. If any refNum has onl one of these accountNums, then they should not be replaced. The same goes for any other accountNum.

I tried following SQL query, but I can't group it properly:

SELECT STRING_AGG(f.id, ' ') WITHIN GROUP ( 
ORDER BY f.id ASC),
f.refNum,
CASE 
    WHEN EXISTS (   SELECT 1 
                    FROM Foo f2 
                    WHERE (f2.accountNum = 10040 
                        OR f2.accountNum = 10041) 
                        AND f2.id != f.id) 
    AND (f.accountNum = 10040 
    OR f.accountNum = 10041) 
    THEN 2 
    ELSE f.accountNum 
END AS accountNumUpdated,
SUM(f.value) AS val 
FROM Foo f 
GROUP BY f.refNum,
    accountNumUpdated

The issue with this query is that accountNumUpdated is not known in the GROUP BY clause. Also I can't move the CASE WHEN into the GROUP BY, as the subquery is not allowed there either.

As I couldn't find any way around these issues, the question is, how do I need to change the query to get the result I want?

The exact SQL server version is:

Microsoft SQL Server 2017 (RTM) - 14.0.1000.169 (X64) 
    Aug 22 2017 17:04:49 
    Copyright (C) 2017 Microsoft Corporation
    Standard Edition (64-bit) on Windows Server 2016 Standard 10.0 <X64> (Build 14393: ) (Hypervisor)

If anything is unclear, don't hesitate to ask


Solution

  • Try using cte on your result. Since you already computed your value

    ; WITH CTE AS (
    SELECT f.id,
    f.refNum,
    CASE 
        WHEN EXISTS (   SELECT 1 
                        FROM Foo f2 
                        WHERE (f2.accountNum = 10040 
                            OR f2.accountNum = 10041) 
                            AND f2.id != f1.id
                            AND f2.refNum = f1.refNum)   --- add this since you want to combine for one refno 
        AND (f.accountNum = 10040 
        OR f.accountNum = 10041) 
        THEN 2 
        ELSE f.accountNum 
    END) AS accountNumUpdated,
    f.value AS val 
    FROM Foo f group by f.id, f.refnum, f.value
    )
    SELECT STRING_AGG(id, ' ') WITHIN GROUP ( ORDER BY id ASC) , 
    max(refNum) as refNum,  max(accountNumUpdated) as accountNumUpdated , sum(val) as val 
    FROM CTE GROUP BY accountNumUpdated