I have a table with users and either activity for each week, for example this is the user_activity
table:
userid | wk1 | wk2 | wk3
u1 | 1 | 0 | 1
u2 | 0 | 1 | 0
u3 | 1 | 0 | 1
I would like to transform it to:
week | active
wk1 | 2
wk2 | 1
wk3 | 2
I can achieve this using UNION ALL like this:
SELECT 'wk1' as week,
SUM( wk1 ) AS active
FROM user_activity
UNION ALL
SELECT 'wk2' as week,
SUM( wk2 ) AS active
FROM user_activity
UNION ALL
SELECT 'wk3' as week,
SUM( wk3 ) AS active
FROM user_activity;
Is there a way to achieve the same without using UNION ALL?
Thanks!
Edit: Impala version: 2.6.0
Reason for no UNION ALL: For each SELECT
, the entire table will be scanned from HDFS. If I have a huge table, this will result in OOM errors.
This should solve your performance issue.
The table is being scanned only once.
The records are not being duplicated X3.
The UNION ALL is used here only for a single record.
select concat('wk',cast(c.i as string)) as week
,case c.i
when 1 then wk1
when 2 then wk2
when 3 then wk3
end as active
from (select sum(wk1) AS wk1
,sum(wk2) AS wk2
,sum(wk3) AS wk3
from user_activity
) t
cross join ( select 1 as i
union all select 2
union all select 3
) c
;
+------+--------+
| week | active |
+------+--------+
| wk1 | 2 |
| wk2 | 1 |
| wk3 | 2 |
+------+--------+