I'm trying to normalize a column in hive, which means I have to make each value in the column divide max value in that column. For example: I have a table X:
A B C D
1 0.1 3 0.2
2 0.2 4 0.5
3 0.3 5 0.2
I want to normalize column C, the requested table Y is:
A B C D
1 0.1 0.6 0.2
2 0.2 0.8 0.5
3 0.3 1 0.2
I do not know how to write SQL to express the max value of the whole column.
max
window function
select a
,b
,c / max(c) over () as c
,d
from x
+---+-----+-----+-----+
| a | b | c | d |
+---+-----+-----+-----+
| 1 | 0.1 | 0.6 | 0.2 |
| 2 | 0.2 | 0.8 | 0.5 |
| 3 | 0.3 | 1 | 0.2 |
+---+-----+-----+-----+