Search code examples
oracle-databasequery-optimizationcommon-table-expressiondatabase-performance

When do we use WITH clause, and what are main benefits of it?


I was working on task about optimization queries. One of the improvement ways was using WITH clause. I notice that it did very good job, and it lead to shorter time of execution, but i am not sure now, when should I use WITH clause and is there any risk of using it?

Here is one of the queries that I am working on :

WITH MY_TABLE AS 
    (   SELECT PROD_KY,
               sum(GROUPISPRIVATE) AS ISPRIVATE,
               sum(GROUPISSHARED) AS ISSHARED
        FROM
             (
                SELECT GRP_PROD_CUSTOMER.PROD_KY,
                       1 as ISPRIVATE,
                       0 as ISSHARED
                FROM CUSTOMER
                JOIN GRP_CUSTOMER ON GRP_CUSTOMER.CUST_KY = CUSTOMER.CUST_KY
                JOIN GRP_PROD_CUSTOMER ON GRP_PROD_CUSTOMER.GRP_KY = GRP_CUSTOMER.GRP_KY                                                                                                                                                                                                                                                                                                            
                GROUP BY GRP_PROD_CUSTOMER.PROD_KY
             ) 
   GROUP BY PROD_KY
)
SELECT * FROM MY_TABLE;

Solution

  • is there any risk of using it?

    Yes. Oracle may decide to materialize the subquery, which means writing its result set to disk and then reading it back (except it might not mean that in 12cR2 or later). That unexpected I/O could be a performance hit. Not always, and usually we can trust the optimizer to make the correct choice. However, Oracle has provided us with hints to tell the optimizer how to handle the result set: /*+ materialize */ to um materialize it and /*+ inline */ to keep it in memory.

    I start with this potential downside because I think it's important to understand that the WITH clause is not a silver bullet and it won't improve every single query, and may even degrade performance. For instance I share the scepticism of the other commenters that the query you posted is in any way faster because you re-wrote it as a common table expression.

    Generally, the use cases for the WITH clause are:

    1. We want to use the result set from the subquery multiple times

      with cte as
        ( select blah from meh )
      select * 
      from t1
           join t2 on t1.id = t2.id
      where t1.col1 in ( select blah from cte )
      and   t2.col2 not in ( select blah from cte)
      
    2. We want to be build a cascade of subqueries:

      with cte as
        ( select id, blah from meh )
        , cte2 as 
         ( select t2.*, cte.blah
           from cte
                join t2 on t2.id = cte.id)
        , cte3 as 
          ( select t3.*, cte2.*
            from cte2
                 join t3 on t3.col2 = cte2.something ) 
         ….
      

    This second approach is beguiling and can be useful for implementing complex business logic in pure SQL. But it can lead to a procedural mindset and lose the power sets and joins. This too is a risk.

    1. We want to use recursive WITH clause. This allows us to replace Oracle's own CONNECT BY syntax with a more standard approach. Find out more

    2. In 12c and later we can write user-defined functions in the WITH clause. This is a powerful feature, especially for users who need to implement some logic in PL/SQL but only have SELECT access to the database. Find out more

    For the record I have seen some very successful and highly performative uses of the second type of WITH clause. However I have also seen uses of WITH when it would have been just as easy to write an inline view. For instance, this is just using the WITH clause as syntactic sugar ...

    with cte as
      ( select id, blah from meh )
    select t2.*, cte.blah 
    from  t2
          join cte on cte.id = t2.id
    

    … and would be clearer as ...

    select t2.*, cte.blah 
    from  t2
          join ( select id, blah from meh ) cte on cte.id = t2.id