I wish to return a boolean value if there are duplicates in the table in Hive 0.9 For now, I'm doing this :
select cast(case when count(*) > 0 then 1 else 0 end as smallint) Validate_Value
from (
select guid, count(guid) cnt
from default.vms_feed v
group by guid
) where v.cnt > 1 ;
But this gives me an error:
"cannot recognize input near 'where' 'v' '.' in subquery source [DB Errorcode=11] "
I'm not sure where I'm going wrong or what I'm missing out here !
Correct syntax for Hive subquery is:
SELECT ... FROM (subquery) name ...
So in your case
SELECT CAST(...) Validate_Value
FROM (
...
) v WHERE v.cnt > 1 ;