Search code examples
sqlsql-servergroup-bymaster-detail

Query to count how often the same detail data is stored in master - detail


I am a bit stuck with building a query to count how often the same detail data is stored in a master-detail relationship in a ms sql server database.

I got a master table, which is not that much important.

MASTER_ID 
1  
2  
3
4

What's important is the following detail table.

DID MASTERID No  Value
1     1        1   220
2     1        2   250
3     1        3   250        
4     2        1   220
5     2        2   250
6     3        1   220
7     3        2   250
8     4        1   220
9     4        2   230

The detail table has n rows, the column No (int) is a sequential number for the same masterid. The value is a decimal value. I would like to group and count the rows that I know how often the exact same detail data is used for the master rows. Definition of the same data: The same number of detail rows and each detail row must be the same (columns no and value).

The wanted result:

Count  No  Value
1      1   220   (DID 1)
1      2   250   (DID 2)
1      3   250   (DID 3)        
2      1   220   (DID 4 and 6)
2      2   250   (DID 5 and 7)
1      1   220   (DID 8)
1      2   230   (DID 9)

The first three rows have count 1 because there is no other master row with exactly these three no and value details. The next two rows have count 2 because there are two master rows which have 1-220 and 2-250 as detail rows. The last two rows have count one because there is only one master which has 1-220 and 2-230 as detail rows.

It's important to understand that for example

No  Value
1 150
2 170

and

No  Value
1 170 
2 150

is not seen as same detail data because of the different order (column no).

Any ideas?


Solution

  • I think I get this now, I think what you are after is comparing MasterID's that have the same value for all rows. This means your first step is to compare MasterID's to see if they are comparable. The best way to do this is to concatenate all rows together for each Master ID, building an out put like:

    MASTER_ID   | No:Value
    1           | [1:220][2:250][3:250]
    2           | [1:220][2:250]  
    3           | [1:220][2:250]
    4           | [1:220][2:230]
    

    Here you can see that 3 and 4 are comparable because they have the same result for the No:Value column.

    This can be achieved using SQL Server's XML extensions:

    SELECT  m.MasterID,
            NoValue = ( SELECT  QUOTENAME(CAST(No AS VARCHAR(10)) 
                                    + ':' + CAST(Value AS VARCHAR(10)))
                        FROM    Detail AS d
                        WHERE   d.MasterID = m.MasterID
                        FOR XML PATH(''), TYPE
                       ).value('.', 'VARCHAR(MAX)')
    FROM    Master AS m;
    

    Then you can join this output to your details table giving:

    DID MASTERID No  Value NoValue
    1     1        1   220 [1:220][2:250][3:250]
    2     1        2   250 [1:220][2:250][3:250]
    3     1        3   250 [1:220][2:250][3:250]
    4     2        1   220 [1:220][2:250]
    5     2        2   250 [1:220][2:250]
    6     3        1   220 [1:220][2:250]
    7     3        2   250 [1:220][2:250]
    8     4        1   220 [1:220][2:230]
    9     4        2   230 [1:220][2:230]
    

    Then it is just a case of counting grouping by No, Value and NoValue:

    WITH MasterValue AS
    (   SELECT  m.MasterID,
                NoValue = ( SELECT  QUOTENAME(CAST(No AS VARCHAR(10)) 
                                        + ':' + CAST(Value AS VARCHAR(10)))
                            FROM    Detail AS d
                            WHERE   d.MasterID = m.MasterID
                            FOR XML PATH(''), TYPE
                           ).value('.', 'VARCHAR(MAX)')
        FROM    Master AS m
    ), GroupedDetail AS
    (   SELECT  d.DID, d.MasterID, d.No, d.Value, mv.NoValue
        FROM    Detail AS d
                INNER JOIN MasterValue AS mv
                    ON mv.MasterID = d.MasterID
    )
    SELECT  Count = COUNT(*), No, Value
    FROM    GroupedDetail
    GROUP BY No, Value, NoValue;
    

    This is the step by step process, however, you can simplify the above into this:

    SELECT  Count = COUNT(*), d.No, d.Value
    FROM    Detail AS d
            CROSS APPLY
            (   SELECT  CAST(No AS VARCHAR(10)) + ':' + CAST(Value AS VARCHAR(10)) + ','
                FROM    Detail AS nv
                WHERE   nv.MasterID = d.MasterID
                FOR XML PATH(''), TYPE
            ) AS nv (NoValue)
            CROSS APPLY (SELECT nv.NoValue.value('.', 'VARCHAR(MAX)')) AS nv2 (NoValue)
    GROUP BY d.No, d.Value, nv2.NoValue
    ORDER BY MIN(d.DID);