I am attempting to subtract a value in a column (column_18
) from the average of another column (avg(column_19)
) and obtain this result as a third column (result
) for each row of the table:
cur.execute("Select avg(column_19) - column_18 as result FROM test1")
This doesn't seem to be working well, and I get this error:
impala.error.HiveServer2Error: AnalysisException: select list expression not produced by aggregation output (missing from GROUP BY clause?): SUM(column_19) / COUNT(column_19) - column_18
I do not want the result to be grouped
avg()
in this context is an aggregate function, which means that it is applied to a group of rows, which may specified with a GROUP BY
clause (or all rows if not specified). The output of an aggregate expression is a single value per-group, so it is not applied per-row as you want.
However, you can accomplish what you're trying to do in a few ways. I think the easiest is by using avg() as an analytic function. For example, you can do something like:
select column_19, column_18, (avg(column_19) over () - column_18) as result from test1
See the documentation for more details about how aggregations and analytic functions work.