I have the impala query:
select id,zip,income
from zipcode_incomes
group by income
having income>avg(income)
but i am getting error, AnalysisException: select list expression not produced by aggregation output (missing from GROUP BY clause?). What mistake did I make?
You cannot have non-aggregated columns in the select
. Presumably, you want something like this:
select id, zip, income
from (select zi.*, avg(zi.income) over () as avg_income
from zipcode_incomes zi
) zi
where income > avg_income;
This returns zip codes whose income is larger than the overall average.