Search code examples
impala

Would view or with clause of impala just compute once and be used multiple times in a query?


with core as (
select
    t1.a,
    t1.b,
    t2.c
from
    test_1 t1
join
    test_2 t2 on t1.a = t2.a
where
    t1.b = 'test' 
)
,tmp_1 as (
select a,count(1) from core group by a
)
,tmp_2 as (
select b,count(1) from core group by b
)
select
    t1.a,
    t1.count,
    t2.count
 from
     tmp_1 t1
 join
     tmp_2 t2 on t1.a=t2.b

My question is, does the core CTE in above query just compute once? or twice in both the tmp_1 and tmp_2 CTEs,I could not find clues in the documentation of impala


Solution

  • As of now, with present versions of Impala and Hive, core CTE in your query will be computed 2 times, as you are referring it in tmp_1 and tmp_2.

    You can observe it in the EXPLAIN PLAN of the query.

    Conclusion is that, you can use WITH clause for better debugging and maintainability of the complex queries. It doesn't help in increasing the performance of the Hive or Impala Query. As per the Hive JIRA site, there is no route map to include this recursive feature down the line.

    Oracle and PostgreSQL supports this recursive feature.