Search code examples
sqlsql-servert-sql

GROUP BY clause with a sum of zero (and non-zero) data


I am expanding on the post that I posted here.

In the previous post, I stated that I have a table with 20 DEFECTID fields along with 20 DEFECTCNT fields that come from a third-party company that my company purchased. With help from users, I used this query to get the result set that I want.

WITH Unpivoted AS
(
    SELECT
        OC_DDATA_PC.PARTNO,
        CAST(OC_DDAT_AUX_PC.UDL40 AS datetime) AS SAMPLEDATE,
        stagingPLM.DBO.PLANTS.PLANT_NAME,
        OC_DDATA_PC.UDL2 as SHIFT,
        OC_DDATA_PC.UDL4 as SIGNON_ID,
        OC_DDATA_PC.UDL6 as LINE_NUMBER,
        OC_DDAT_AUX_PC.UDL12 as CAVITY,
        OC_DDAT_AUX_PC.UDL27 as SPEC_TYPE,
        OC_DDAT_AUX_PC.UDL28 as PROPERTY_TREE,
        OC_DDAT_AUX_PC.UDL30 as DEFECT_TYPE,
        OC_DDATA_PC.SSIZE,
        ColName,
        ColID,
        ColValue
    FROM            
        OC_DDATA_PC 
    INNER JOIN
        OC_DDAT_AUX_PC ON OC_DDATA_PC.PARTNO = OC_DDAT_AUX_PC.PARTNOAUX AND OC_DDATA_PC.DATETIME = OC_DDAT_AUX_PC.DATETIMEAUX 
    INNER JOIN
        stagingPLM.dbo.PLANTS ON OC_DDATA_PC.UDL1 = stagingPLM.dbo.PLANTS.PLANT_CODE
    CROSS APPLY (
        VALUES
        ('DEFECTID1',OC_DDATA_PC.DEFECTID1,OC_DDATA_PC.DEFECTCNT1),
        ('DEFECTID2',OC_DDATA_PC.DEFECTID2,OC_DDATA_PC.DEFECTCNT2),
        ('DEFECTID3',OC_DDATA_PC.DEFECTID3,OC_DDATA_PC.DEFECTCNT3),
        ('DEFECTID4',OC_DDATA_PC.DEFECTID4,OC_DDATA_PC.DEFECTCNT4),
        ('DEFECTID5',OC_DDATA_PC.DEFECTID5,OC_DDATA_PC.DEFECTCNT5),
        ('DEFECTID6',OC_DDATA_PC.DEFECTID6,OC_DDATA_PC.DEFECTCNT6),
        ('DEFECTID7',OC_DDATA_PC.DEFECTID7,OC_DDATA_PC.DEFECTCNT7),
        ('DEFECTID8',OC_DDATA_PC.DEFECTID8,OC_DDATA_PC.DEFECTCNT8),
        ('DEFECTID9',OC_DDATA_PC.DEFECTID9,OC_DDATA_PC.DEFECTCNT9),
        ('DEFECTID10',OC_DDATA_PC.DEFECTID10,OC_DDATA_PC.DEFECTCNT10),
        ('DEFECTID11',OC_DDATA_PC.DEFECTID11,OC_DDATA_PC.DEFECTCNT11),
        ('DEFECTID12',OC_DDATA_PC.DEFECTID12,OC_DDATA_PC.DEFECTCNT12),
        ('DEFECTID13',OC_DDATA_PC.DEFECTID13,OC_DDATA_PC.DEFECTCNT13),
        ('DEFECTID14',OC_DDATA_PC.DEFECTID14,OC_DDATA_PC.DEFECTCNT14),
        ('DEFECTID15',OC_DDATA_PC.DEFECTID15,OC_DDATA_PC.DEFECTCNT15),
        ('DEFECTID16',OC_DDATA_PC.DEFECTID16,OC_DDATA_PC.DEFECTCNT16),
        ('DEFECTID17',OC_DDATA_PC.DEFECTID17,OC_DDATA_PC.DEFECTCNT17),
        ('DEFECTID18',OC_DDATA_PC.DEFECTID18,OC_DDATA_PC.DEFECTCNT18),
        ('DEFECTID19',OC_DDATA_PC.DEFECTID19,OC_DDATA_PC.DEFECTCNT19),
        ('DEFECTID20',OC_DDATA_PC.DEFECTID20,OC_DDATA_PC.DEFECTCNT20)

    ) AS Unpvt(ColName, ColID, ColValue)

    -- Currently set up for only for Pulp Plates or Pulp Bowls.
    -- Change the Property Tree (or even a different field) to add/remove records.
    WHERE
        OC_DDAT_AUX_PC.UDL28 LIKE 'PULP %'
)
SELECT *
FROM Unpivoted

Now, I need to take it one step farther. I need to GROUP BY the SAMPLEDATE, PARTNO, LINE_NUMBER, CAVITY, and the SIGNON_ID. The second to last column is the DEFECTID that I can do a JOIN on to get the description value. A zero (0) represents that there is no defect.

Every query that I can think of continues to return twenty rows. The above query returns this which is what (I think) I want:

PARTNO SAMPLEDATE PLANT_NAME SHIFT SIGNON_ID LINE_NUMBER CAVITY SPEC_TYPE PROPERTY_TREE DEFECT_TYPE SSIZE DEFECTNAME DERFECTID DEFECTCOUNT
12FBRB 2024-02-14 12:23:21.790 Mason Fiber 2 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 10 DEFECTID1 1683 10
12FBRB 2024-02-13 16:08:40.280 Mason Fiber 3 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 15 DEFECTID1 1683 15
12FBRB 2024-02-14 12:23:21.790 Mason Fiber 2 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 10 DEFECTID2 0 0
12FBRB 2024-02-13 16:08:40.280 Mason Fiber 3 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 15 DEFECTID2 0 0
12FBRB 2024-02-14 12:23:21.790 Mason Fiber 2 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 10 DEFECTID3 0 0
12FBRB 2024-02-13 16:08:40.280 Mason Fiber 3 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 15 DEFECTID3 0 0
12FBRB 2024-02-14 12:23:21.790 Mason Fiber 2 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 10 DEFECTID4 0 0
12FBRB 2024-02-13 16:08:40.280 Mason Fiber 3 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 15 DEFECTID4 0 0
12FBRB 2024-02-14 12:23:21.790 Mason Fiber 2 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 10 DEFECTID5 0 0
12FBRB 2024-02-13 16:08:40.280 Mason Fiber 3 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 15 DEFECTID5 0 0
12FBRB 2024-02-14 12:23:21.790 Mason Fiber 2 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 10 DEFECTID6 0 0
12FBRB 2024-02-13 16:08:40.280 Mason Fiber 3 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 15 DEFECTID6 0 0
12FBRB 2024-02-14 12:23:21.790 Mason Fiber 2 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 10 DEFECTID7 0 0
12FBRB 2024-02-13 16:08:40.280 Mason Fiber 3 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 15 DEFECTID7 0 0
12FBRB 2024-02-14 12:23:21.790 Mason Fiber 2 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 10 DEFECTID8 0 0
12FBRB 2024-02-13 16:08:40.280 Mason Fiber 3 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 15 DEFECTID8 0 0
12FBRB 2024-02-14 12:23:21.790 Mason Fiber 2 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 10 DEFECTID9 0 0
12FBRB 2024-02-13 16:08:40.280 Mason Fiber 3 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 15 DEFECTID9 0 0
12FBRB 2024-02-14 12:23:21.790 Mason Fiber 2 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 10 DEFECTID10 0 0
12FBRB 2024-02-13 16:08:40.280 Mason Fiber 3 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 15 DEFECTID10 0 0
12FBRB 2024-02-14 12:23:21.790 Mason Fiber 2 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 10 DEFECTID11 0 0
12FBRB 2024-02-13 16:08:40.280 Mason Fiber 3 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 15 DEFECTID11 0 0
12FBRB 2024-02-14 12:23:21.790 Mason Fiber 2 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 10 DEFECTID12 0 0
12FBRB 2024-02-13 16:08:40.280 Mason Fiber 3 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 15 DEFECTID12 0 0
12FBRB 2024-02-14 12:23:21.790 Mason Fiber 2 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 10 DEFECTID13 0 0
12FBRB 2024-02-13 16:08:40.280 Mason Fiber 3 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 15 DEFECTID13 0 0
12FBRB 2024-02-14 12:23:21.790 Mason Fiber 2 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 10 DEFECTID14 0 0
12FBRB 2024-02-13 16:08:40.280 Mason Fiber 3 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 15 DEFECTID14 0 0
12FBRB 2024-02-14 12:23:21.790 Mason Fiber 2 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 10 DEFECTID15 0 0
12FBRB 2024-02-13 16:08:40.280 Mason Fiber 3 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 15 DEFECTID15 0 0
12FBRB 2024-02-14 12:23:21.790 Mason Fiber 2 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 10 DEFECTID16 0 0
12FBRB 2024-02-13 16:08:40.280 Mason Fiber 3 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 15 DEFECTID16 0 0
12FBRB 2024-02-14 12:23:21.790 Mason Fiber 2 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 10 DEFECTID17 0 0
12FBRB 2024-02-13 16:08:40.280 Mason Fiber 3 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 15 DEFECTID17 0 0
12FBRB 2024-02-14 12:23:21.790 Mason Fiber 2 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 10 DEFECTID18 0 0
12FBRB 2024-02-13 16:08:40.280 Mason Fiber 3 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 15 DEFECTID18 0 0
12FBRB 2024-02-14 12:23:21.790 Mason Fiber 2 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 10 DEFECTID19 0 0
12FBRB 2024-02-13 16:08:40.280 Mason Fiber 3 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 15 DEFECTID19 0 0
12FBRB 2024-02-14 12:23:21.790 Mason Fiber 2 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 10 DEFECTID20 0 0
12FBRB 2024-02-13 16:08:40.280 Mason Fiber 3 D925512 FORMER 1 BOWLS PULP BOWLS CONSTRUCTION 15 DEFECTID20 0 0

The result I am looking for would be this (example):

PARTNO SAMPLEDATE PLANT_NAME SHIFT SIGNON_ID LINE_NUMBER CAVITY SPEC_TYPE PROPERTY_TREE DEFECT_TYPE SSIZE DEFECTNAME DERFECTID DEFECTCOUNT
9FBRP1 12/19/2023 09:36:45.51 Mason Fiber 2 D928099 FORMER 3 PLATES PULP PLATES WATER RESISTANCE 24 DEFECTID1 1681 1
9FBRP1 12/19/2023 09:40:38.66 Mason Fiber 2 D928099 FORMER 4 PLATES PULP PLATES GREASE RESISTANCE 24 DEFECTID1 0 0
9FBRP1 12/19/2023 09:47:49.57 Mason Fiber 2 D928099 FORMER 3 PLATES PULP PLATES CONSTRUCTION 24 DEFECTID1 0 0
9FBRP1 02/21/2024 08:27:45.68 Mason Fiber 2 D925512 FORMER 2 15 PLATES PULP PLATES CONSTRUCTION 2 DEFECTID1 0 0
9FBRP1 02/21/2024 08:27:45.68 Mason Fiber 2 D925512 FORMER 2 25 PLATES PULP PLATES CONSTRUCTION 1 DEFECTID1 1102 1
9FBRP1 02/21/2024 08:27:45.68 Mason Fiber 2 D925512 FORMER 2 25 PLATES PULP PLATES CONSTRUCTION 1 DEFECTID2 1683 1
9FBRP1 02/21/2024 08:27:45.68 Mason Fiber 2 D925512 FORMER 2 25 PLATES PULP PLATES CONSTRUCTION 1 DEFECTID3 1096 1
9FBRP1 02/21/2024 08:27:45.68 Mason Fiber 2 D925512 FORMER 2 25 PLATES PULP PLATES CONSTRUCTION 1 DEFECTID4 1319 1
9FBRP1 02/21/2024 08:27:45.68 Mason Fiber 2 D925512 FORMER 2 26 PLATES PULP PLATES CONSTRUCTION 2 DEFECTID1 0 0
9FBRP1 02/21/2024 08:27:45.68 Mason Fiber 2 D925512 FORMER 2 28 PLATES PULP PLATES CONSTRUCTION 2 DEFECTID1 0 0

This is the output that I am looking for. I still need to report on if no defects were found out of the twenty fields available and their corresponding defect counts.

I am getting pretty defeated here. Is what I am asking for possible?

I know that the tables are not optimized but I cannot change them. I apologize for posting twice on the same subject but I need to come through.


Solution

  • The easiest to solve it based by previous question is adding a ROW_NUMBER column and then filter by it:

    SELECT  *
    INTO #OC_DDATA_PC
    FROM    (
        VALUES  (N'10B20', 2022011111314430, '', 12, 91, 0, 0, 1, 0, 0)
        ,   (N'10B20', 2022011209511423, '', 10, 91, 0, 0, 1, 0, 0)
        ,   (N'10B20', 2022011315280083, '', 10, 97, 0, 0, 1, 0, 0)
        ,   (N'10B20', 2022011411403882, '', 10, 81, 91, 0, 1, 1, 0)
        ,   (N'10B20', 2022011823363688, '', 10, 91, 0, 0, 1, 0, 0)
        ,   (N'10B20', 2022011823363689, '', 10, 0, 0, 0, 0, 0, 0)
    ) t (PARTNO,DATETIME,CAVITY,SSIZE,DEFECTID1,DEFECTID2,DEFECTID3,DEFECTCNT1,DEFECTCNT2,DEFECTCNT3)
    
    
    SELECT  *
    INTO #OC_DMDL_PC
    FROM    (
        VALUES  
            (81, N'MOLD BUILD-UP')
        ,   (91, N'PITTED SURFACE')
        ,   (97, N'RIDER')
    ) t (DEFECTID,DESCRIPT)
    
    select *
    from (
        SELECT  d.PARTNO, d.DATETIME, CAVITY, SSIZE, def.*, defDesc.*
        -- New!
        ,   ROW_NUMBER() OVER(PARTITION BY d.PARTNO, d.DATETIME, d.CAVITY, SSIZE ORDER BY -defId) AS defectRank
        FROM    #OC_DDATA_PC d
        CROSS APPLY (
            VALUES  (DEFECTID1, DEFECTCNT1, 1)
            ,   (DEFECTID2, DEFECTCNT2, 2)
            ,   (DEFECTID3, DEFECTCNT3, 3)
            ) def (defId, defCount, def#)
        LEFT JOIN #OC_DMDL_PC defDesc
            ON  defDesc.DEFECTID = def.defId
          ) x
    WHERE defId > 0 OR (defectRank = 1 AND defId = 0) -- NEW
    

    Explanation:

    Since you want to get all defective rows we add WHERE defId > 0 to the selection.

    To handle displaying one non-defective row, i add a ROW_NUMBER counter that basically prioritizes defective rows first. So, if first row is non-defective, it means all the other rows are non-defective either, so then we only fetch defectRank = 1.

    ROW_NUMBER() OVER(PARTITION BY d.PARTNO, d.DATETIME, d.CAVITY, SSIZE ORDER BY -defId) AS defectRank - partition by part is the grouping you're looking for, -defID will sort defective rows first.

    Note that ROW_NUMBER cannot be used directly in a WHERE clause, so you have to wrap whole query in a subquery