Search code examples
sqlsql-serverquery-optimization

How to make SQL only read CTE referenced in IN clause once?


WITH RootItemIDs AS 
(
    SELECT DISTINCT ParentID 
    FROM C_Item 
    WHERE ParentID NOT IN (SELECT ItemID FROM C_Item)
)
SELECT COUNT(*)
FROM C_OrderHeader oh
JOIN C_OrderDetail od ON oh.OrderID = od.OrderID
JOIN C_Item i ON od.ItemID = i.ItemID
JOIN C_Item rootItem ON i.ParentID = rootItem.ItemID 
                     OR (i.ParentID IN (SELECT * FROM RootItemIDs) 
                         AND i.ItemID = rootItem.ItemID)

I'm using COUNT(*) for convenience here, but eventually the point will be to create a view with columns for the order item and show information about its parent item next to it.

Some C_Item rows have ParentID values that don't point to anything though, in which case it should be considered the "parent" (this is the reason I used the term "root" instead of "parent" in the CTE to be a little more accurate).

Here's the plan: https://www.brentozar.com/pastetheplan/?id=ByD4R3F0s. Notice the part in the attached picture where it is returning millions of rows for a table that only has 83 records because it is reading the table for every order detail. FYI, I tried with a temp table instead of a CTE and got the same issue.

Hardcoding values in the IN clause (in place of SELECT * FROM RootItemIDs) fixes the issue, of course, but obviously I'd prefer not to do that so I don't have to update the query in the future if more values or added.

How would you rewrite this to not scan C_Items repeatedly? (and preferably without temp tables if possible as I'd like to be able to use it in a view.)

EDIT: Forgot to add the aforementioned picture; added now.enter image description here


Solution

  • ORACLE
    Not having any insight into actual data (or RDBMS) you are working with, we could only try to create some imaginary sample data and try to answer the question. Assuming that your cte RootItemIDs should be used to filter just the ITEM_IDs with non existant PARENT_ID with the sample data (including your cte) similar to this:

    WITH
        order_header(ORDER_ID, ORDER_DATE, CUSTOMER_ID) AS 
            (
                Select 1, To_Date('01.02.2023', 'dd.mm.yyyy'), 101 From Dual Union ALL
                Select 2, To_Date('01.02.2023', 'dd.mm.yyyy'), 101 From Dual Union ALL
                Select 3, To_Date('01.02.2023', 'dd.mm.yyyy'), 101 From Dual 
            ),
        order_detail (ORDER_ID, ITEM_ID, QUANTITY, PRICE ) AS
            (
                Select 1,  'ITM_11', 10, 15 From Dual Union All
                Select 1,  'ITM_12', 12, 10 From Dual Union All
                -- 
                Select 2,  'ITM_21', 15, 12 From Dual Union All
                Select 2,  'ITM_22', 12, 19 From Dual Union All
                Select 2,  'ITM_23', 25, 17 From Dual Union All
                --
                Select 3,  'ITM_31', 15, 12 From Dual 
            ),
        item (ITEM_ID, PARENT_ID) AS
            (
                Select 'ITM_10', 'ITM_10' From Dual Union All
                Select 'ITM_11', 'ITM_10' From Dual Union All
                Select 'ITM_12', 'ITM_10' From Dual Union All
                --
                Select 'ITM_20', 'ITM_20' From Dual Union All
                Select 'ITM_21', 'ITM_20' From Dual Union All
                Select 'ITM_22', 'ITM_20' From Dual Union All
                Select 'ITM_23', 'ITM_20' From Dual Union All
                --
                Select 'ITM_30', 'ITM_30' From Dual Union All
                Select 'ITM_31', 'ITM_35' From Dual   -- ITM_35 non existing PARENT_ID
            ),
    --
        RootItemIDs AS 
            (
                SELECT DISTINCT PARENT_ID 
                FROM item 
                WHERE PARENT_ID NOT IN (SELECT ITEM_ID FROM item)
            )
    

    With this kind of data you don't need the last join in your SQL - just move the select condition using your cte to the previous JOIN. The code would be like below:

    SELECT  Count(*) "COUNT_OF"
    FROM    order_header oh
    JOIN    order_detail od ON (oh.ORDER_ID = od.ORDER_ID)
    JOIN    item i ON (od.ITEM_ID = i.ITEM_ID And i.PARENT_ID IN(SELECT * FROM RootItemIDs) )
    

    ... with sample data from above this would return:

      COUNT_OF
    ----------
             1
    

    ... and if you change your Select clause then:

    --  SELECT  i.ITEM_ID, i.PARENT_ID
    ITEM_ID PARENT_ID
    ------- ---------
    ITM_31  ITM_35   
    
    --  Select * 
      ORDER_ID ORDER_DATE CUSTOMER_ID   ORDER_ID ITEM_ID   QUANTITY      PRICE ITEM_ID PARENT_ID
    ---------- ---------- ----------- ---------- ------- ---------- ---------- ------- ---------
             3 01-FEB-23          101          3 ITM_31          15         12 ITM_31  ITM_35   
    

    Addition after comments
    still missing sample data and expected result, so still using above sample data and maybe this could help (pretty sure it could be adjusted to sql-server too):

    SELECT  i.ITEM_ID, 
            CASE WHEN i.PARENT_ID IN(SELECT * FROM RootItemIDs) THEN i.ITEM_ID || ' - self id' ELSE i.PARENT_ID END "PARENT_OR_SELF_ID"
    FROM    order_header oh
    JOIN    order_detail od ON (oh.ORDER_ID = od.ORDER_ID)
    JOIN    item i ON (od.ITEM_ID = i.ITEM_ID)
    
    --  R e s u l t :
    ITEM_ID PARENT_OR_SELF_ID
    ------- -----------------
    ITM_11  ITM_10            
    ITM_12  ITM_10            
    ITM_21  ITM_20            
    ITM_22  ITM_20            
    ITM_23  ITM_20            
    ITM_31  ITM_31 - self id
    

    ADDITION 2
    I added column DESCRIPTION in item table:

        item (ORDER_ID, ITEM_ID, PARENT_ID, DESCRIPTION) AS
            (
                Select 1, 'ITM_10', Null, 'I am a parent 10' From Dual Union All
                Select 1, 'ITM_11', 'ITM_10', 'Child 11' From Dual Union All
                Select 1, 'ITM_12', 'ITM_10', 'Child 12' From Dual Union All
                --
                Select 2, 'ITM_20', Null, 'I am a parent 20' From Dual Union All
                Select 2, 'ITM_21', 'ITM_20', 'Child 21' From Dual Union All
                Select 2, 'ITM_22', 'ITM_20', 'Child 22' From Dual Union All
                Select 2, 'ITM_23', 'ITM_20', 'Child 23' From Dual Union All
                --
                Select 3, 'ITM_30', Null, 'I am a parent 30' From Dual Union All
                Select 3, 'ITM_31', 'ITM_35', 'Child 31' From Dual 
            )
    

    ... and now making another join to item table to get the DESCRIPTION

    SELECT  i.ITEM_ID, 
            CASE WHEN i.PARENT_ID IN(SELECT * FROM RootItemIDs) THEN i.ITEM_ID || ' - self id' ELSE i.PARENT_ID END "PARENT_OR_SELF_ID",
            i2.DESCRIPTION
    FROM    order_header oh
    JOIN    order_detail od ON (oh.ORDER_ID = od.ORDER_ID)
    JOIN    item i ON (od.ITEM_ID = i.ITEM_ID)
    JOIN    item i2 ON(i2.ITEM_ID = CASE WHEN i.PARENT_ID IN(SELECT * FROM RootItemIDs) THEN i.ITEM_ID ELSE i.PARENT_ID END)
    
    
    ITEM_ID PARENT_OR_SELF_ID DESCRIPTION    
    ------- ----------------- ----------------
    ITM_11  ITM_10            I am a parent 10 
    ITM_12  ITM_10            I am a parent 10 
    ITM_21  ITM_20            I am a parent 20 
    ITM_22  ITM_20            I am a parent 20 
    ITM_23  ITM_20            I am a parent 20 
    ITM_31  ITM_31 - self id  Child 31       
    

    Does this help?