Search code examples
hivesubqueryin-subquery

Hive : Multiple In-elements with single subquery


I was trying to run a query like this on hive (version 1.2.1) :

DELETE  FROM employee as e WHERE  (e.id, e.name) IN ( SELECT emp.id, emp.name FROM employee_final  emp) AND e.sno = 120 ;

But since hive doesn't support this construct so I tried this :

DELETE FROM employee WHERE 
id  IN (  SELECT emp.id  FROM employee_final AS  emp ) 
AND 
name IN ( SELECT emp.name  FROM employee_final AS  emp      )  
AND e.sno = 120 ;

But it seems there is a limitation in hive and it supports just 1 subquery and I am getting an exception like this

Error Message : Grammatically incorrect SQL Query :]; nested exception is org.apache.hive.service.cli.HiveSQLException: Error while compiling statement: FAILED: SemanticException [Error 10249]: Line 1:185 Unsupported SubQuery Expression 'name': Only 1 SubQuery expression is supported.

What is the most efficient way to handle this problem.


Solution

  • You can try following

    1. through EXISTS I am not sure about this form for subquery is supported in HIVE.

      DELETE FROM employee as e WHERE EXISTS ( SELECT 1 FROM employee_final emp where e.id = emp.id and e.name =  emp.name ) AND e.sno = 120 ;
      
    2. through CONCAT function

      DELETE FROM employee as e WHERE concat (e.id, '#' ,e.name) IN ( SELECT concat (emp.id,'#' ,emp.name) FROM employee_final emp) AND e.sno = 120 ;