I am trying to use the window function count distinct in hive, and getting error.
The query used is as follows:
select
user
,dt
,count(distinct dt) over(partition by user) as dates
from table
It throws the following error:
FAILED: SemanticException [Error 10025]: Line 1:123 Expression not in GROUP BY key 'user'
I think I am using the syntax as specified in the hive language manual
What exactly am I doing wrong?
Another solution would be to use a combination of size
and collect_set
:
select
user,
dt,
size(collect_set(dt) over(partition by user)) as dates
from
table