i have the following query in hive to get the counts per each of those columns (cluster, country and airline) as a percentage. But my percentage column contains only 0's.. why/what am i doing wrong below?
select
count(*)/ t.cnt * 100 AS percentage,
cluster,
country,
airline
from
table1
CROSS JOIN (SELECT COUNT(*) AS cnt FROM table1 ) t
GROUP
BY cluster,
country,
airline
First, you should use window functions.
Second, beware of integer division.
I would phrase this as:
select count(*) * 100.0 / sum(count(*)) over () AS percentage,
cluster, country, airline
from table1
group by cluster, country, airline;