I'm trying to figure out a work around for the fact HIVE doesn't support correlated subqueries. Ultimately, I've been counting how many items exist in the data each week over the last month, and now I want to know how many items dropped out this week, came back, or are totally new. Wouldn't be too hard if I could use a where subquery but I'm having a tough time thinking of a work around without it.
Select
count(distinct item)
From data
where item in (Select item from data where date <= ("2016-05-10"))
And date between "2016-05-01" and getdate()
Any help would be great. Thank you.
Work around is left join with two result set and where second result set column is null.
ie
Select count (a.item)
from
(select distinct item from data where date between "2016-05-01" and getdate()) a
left join (Select distinct item from data where date <= ("2016-05-10")) b
on a.item =b.item
and b.item is null