Search code examples
sqlhivequery-optimizationhiveql

Need guidance to simplify this query


We have a huge query with many conditions. However, we feel that some of the conditions are irrelevant can you please let me know if where clause here, can be removed

Please find below excerpt of the huge query ( the sub-query) with where clause

SELECT f1,  ... f10 
FROM A
JOIN
SELECT f1, f2 ... f10
FROM B where PROC_DT IN (SELECT PROC_DATE FROM C)
ON A.ID = B.ID
WHERE ISNOTNULL(PROC_DT)

I think the query will already validate the PROC_DATE in getting data from table B. so can we remove the where clause here.

Can someone please confirm my findings


Solution

  • yes you are correct, because of the in it can't be null. I would use an inner join instead, in some cases it will be faster.

    SELECT f1,  ... f10 
    FROM A
    JOIN B ON A.ID = B.ID
    join (SELECT DISTINCT PROC_DATE FROM C) AS X ON B.PROC_DT = X.PROC_DATE