I have a table with customer data, and variables for the number of values per row, the column prefix, and the filler value.
I'm trying to write a dynamic statement/query to transpose the customer data based on the ID column into a new table with the specified number of columns per row, named with the column prefix and sequential numbers. If any column doesn't have data, it should use the filler value.
Example:
Input table:
Cust_ID | Cust_Values |
---|---|
1 | CI10 |
1 | CI11 |
1 | CI12 |
1 | CI13 |
1 | CI14 |
2 | CI20 |
Variables:
values_per_row = 2
col_prefix = 'CV_'
filler_value = 'XX'
Expected output:
When values_per_row is 2:
ID | CV_1 | CV_2 |
---|---|---|
1 | CI10 | CI11 |
1 | CI12 | CI13 |
1 | CI14 | XX |
2 | CI20 | XX |
When values_per_row is 3:
ID | CV_1 | CV_2 | CV_3 |
---|---|---|---|
1 | CI10 | CI11 | CI12 |
1 | CI13 | CI14 | XX |
2 | CI20 | XX | XX |
The final output should have the column names, based on the col_prefix and values_per_row variables. The solution should be flexible and reusable so that it can be used to transpose data based on the variables.
I've been trying hard to generate this efficiently in snowflake for some time now. Any help/insight is greatly appreciated.
The following SQL gives the desired output for 3 columns (you can modify the code for other options for number of columns):
with t0(cust_id, cust_value) as (
select * from values
(1, 'CI10'),
(1, 'CI11'),
(1, 'CI12'),
(1, 'CI13'),
(1, 'CI14'),
(2, 'CI20')
), t1 as (
select
*,
row_number() over (partition by cust_id order by cust_value)-1 as rn
from t0
)
select
cust_id,
max(case when rn %3 = 0 then cust_value end) as cv_1,
ifnull(max(case when rn %3 = 1 then cust_value end),'XX') as cv_2,
ifnull(max(case when rn %3 = 2 then cust_value end),'XX') as cv_3
from t1
group by cust_id, floor(rn/3)
order by cust_id, floor(rn/3)
;
CUST_ID | CV_1 | CV_2 | CV_3 |
---|---|---|---|
1 | CI1O | CI11 | CI12 |
1 | CI13 | CI14 | XX |
2 | CI20 | XX | XX |
For a dynamic SQL approach where you specify the variables, generate the SQL dynamically, and return the results you can use the following:
declare
values_per_row int := 3;
col_prefix text := 'CV_';
filler_value text := 'XX';
sql text;
res resultset;
begin
sql := '
with t0 as (
select
*,
row_number() over (partition by cust_id order by cust_values)-1 as rn
from cust
)
select
cust_id
';
for i in 1 to values_per_row do
sql := sql || ', ifnull(max(case when rn % ' || values_per_row || ' = ' || i || '-1 then cust_values end), ''' || filler_value || ''') as ' || col_prefix || i;
end for;
sql := sql || ' from t0 group by cust_id, floor(rn/' || values_per_row || ') order by cust_id, floor(rn/' || values_per_row || ');';
res := (execute immediate :sql);
return table (res);
end;