Search code examples
sqlsql-servergreatest-n-per-groupnested-sets

SQL conditional variables in select query


I'm trying to write an SQL query that uses the Nested Interval Hierarchy database model.

--Given a parent Id, this query retrieves the position of the youngest child which can be inserted into the table
SELECT TOP 1
    --compute values based on the youngest "sibling id" (the one with the highest value)
    (parent.m_11 * (FLOOR(child.m_11/child.m_12)+2) - parent.m_12) as m_11,
    (parent.m_11) as m_12, 
    (parent.m_21 * (FLOOR(child.m_11/child.m_12)+2) - parent.m_22) as m_21,
    (parent.m_21) as m_22
    FROM my_so_table child
        --grabs all children of the parent
        JOIN my_so_table parent
          ON parent.Id = 1
         AND parent.m_21 = child.m_22
         AND parent.m_11 = child.m_12
    --the operation "Floor(child.m_11 / child.m_12) is the siblingId that I need to do math on above
    ORDER By FLOOR(child.m_11/child.m_12) DESC
GO

With schema:

GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

CREATE TABLE [dbo].[my_so_table](
    [m_11] [int] NOT NULL,
    [m_12] [int] NOT NULL,
    [m_21] [int] NOT NULL,
    [m_22] [int] NOT NULL,
    [Id] [int] IDENTITY(1,1) NOT NULL)
GO

INSERT INTO [dbo].[my_so_table] VALUES (2,1,1,0); --1.
INSERT INTO [dbo].[my_so_table] VALUES (3,1,1,0); --2.
INSERT INTO [dbo].[my_so_table] VALUES (3,2,2,1); --1.1
INSERT INTO [dbo].[my_so_table] VALUES (4,1,1,0); --3.

With the above schema, running the above query with parentId of 1 properly returns

5,2,3,1

which correctly represents the matrix of the new node to be inserted under the parent with Id of 1. Running the above query on parentId of 2 returns an empty list, when it should be returning

5,3,2,1

Representing the matrix of the first child under the parent with Id of 2.

This is wrong because my query is not handling the case that there are no children of parent. If there are no children of parent, the result of Floor(child.m_11 / child.m_12) should be -1. How do I alter my query to accomplish this?


Solution

  • I found my solution. I was shying away from greatest-n-by-group when in reality I didn't understand how group-by worked.

    --Given a parent Id, this query retrieves the position of the youngest child which can be inserted into the table
    --insert into my_so_table(m_11, m_12, m_21, m_22)
    SELECT TOP 1
        --compute values based on the youngest 'sibling id' (the one with the highest value)
        (parent.m_11 * (ISNULL(siblingId, 0) + 2) - parent.m_12) as m_11,
        (parent.m_11) as m_12, 
        (parent.m_21 * (ISNULL(siblingId, 0) + 2) - parent.m_22) as m_21,
        (parent.m_21) as m_22
        FROM my_so_table parent
            --grabs all children of the parent
            LEFT JOIN (
                --Grabs the youngest sibling for each sibling chain
                SELECT
                    child.m_12,
                    child.m_22,
                    Max(Floor(child.m_11 / child.m_12)) as siblingId
                FROM my_so_table child
                Group By child.m_12, child.m_22
            ) child
            on(parent.m_21 = child.m_22)
            AND(parent.m_11 = child.m_12)
        WHERE parent.Id = @parentId
        ORDER By siblingId DESC
    GO
    

    group-by wasn't working before because I couldn't retrieve m_12 and m_22 from the child query, because I wasn't group bying both values. Switching to left-join causes null values to be reported, which is exactly what I needed!