How do I combine rows while removing duplicate rows?
I tried UNION
by itself as well as UNION DISTINCT
but both returned error messages in Hue.
error while compiling statement: failed: parseexception line 5:10 mismatched input 'distinct' expecting all near 'union' in set operator
SELECT DISTINCT(product1.user)
FROM product1
UNION
SELECT DISTINCT(product2.user)
FROM product2
UNION
SELECT DISTINCT(product3.user)
FROM product3
As answered by @Andrew in the comments above:
If you are on a Hive version prior to 1.2, only union all
is supported. So you'll have to use union all
, and wrap an outer query around it for the distinct:
select distinct user from (select user from product1 union all select user from product2...) t