reshaping table based on column values

I was looking at a problem of reshaping a table creating new columns according based on values. I'm using the same example as this problem discussed there: A complicated sum in R data.table that involves looking at other columns

so I have a table:

df:([]ID:1+til 5;
        Group:1 1 2 2 2;
        V1:10 + 2 * til 5;
        Type_v1:`t1`t2`t1`t1`t2;
        V2:3 0N 0N 7 8;
        Type_v2:`t2```t3`t3);

ID Group V1 Type_v1 V2 Type_v2
------------------------------
1  1     10 t1      3  t2     
2  1     12 t2                
3  2     14 t1                
4  2     16 t1      7  t3     
5  2     18 t2      8  t3

and the goal is to transform it to get the sum of values by group and type. please note the new columns created. basically all types in Type_v1 and Type_v2 are used to create columns for the resulting table.

#   group v_1 type_1 v_2 type_2 v_3 type_3
#1:     1  10     t1  15     t2  NA   <NA>
#2:     2  30     t1  18     t2  15     t3

I did the beginning but I am unable to transform the table and create the new columns. also of course I'm trying to get all the columns created in a dynamic way, as it would not be possible to input 20k columns manually.

df1:select Group, Value:V1, Type:Type_v1 from df;
df2:select Group, Value:V2, Type:Type_v2 from df;
tr:df1,df2;
tr:0!select sum Value by Group, Type from tr where Type <> ` ;

basically I'm missing the equivalent of:
dcast(tmp, group ~ rowid(group), value.var = c("v", "type"))

any help and explanations appreciated,

Solution

You could expand on Terry's pivot to dynamically do the select parts above using functional form. See more detail here:

https://code.kx.com/q/basics/funsql/

// Personally, I would try to stay clear of column names too similar to reserved keywords in kdb
df: `id`grpCol`v_1`typCol_1`v_2`typCol_2 xcol df;

{[df;n]
   
    // dynamically create cols from 1 to n
    cls:`$("v_";"typCol_"),\:/:string 1 + til n;

    // functional form of select for each type/value col before joining together
    df:(,/) {?[x;();0b;`grpCol`v`typCol!`grpCol,y]}[df] each cls;

    // sum, then pivot
    df:0!select sum v by grpCol, typCol from df where typCol <> `;
    P:exec distinct typCol from df;
    df:exec P#(typCol!v) by grpCol:grpCol from df;

    // Type cols seem unnecessary but 
    // Can be done with another functional select
    ?[df;();0b;(`grpCol,raze P,'`$"typCol_",/:string 1 + til count P)!`grpCol,raze flip (P;enlist each P)]

    }[df;2]

grpCol  t1  typCol_1    t2  typCol_2    t3  typCol_3
1       10  t1          15  t2          0N  t3
2       30  t1          18  t2          15  t3

EDIT - More detailed breakdown below:

cls:`$("v_";"typCol_") ,\:/: string 1 + til n;

Dynamically create a symbol list for the columns as they are required for column names when using functional form. I start by creating a list of v_ and typCol_ up to number n.

,\:/: -> join with each left and each right iterators

https://code.kx.com/q/ref/maps/#each-left-and-each-right

This allows me to join every item on the left ("v_";"typCol_") with every item on the right.

The same could be achieved with cross but you would have to restructure the list with flip and cut flip n cut `$("v_";"typCol_") cross string 1 + til n

(,/) {?[x;();0b;`grpCol`v`typCol!`grpCol,y]}[df] each cls;

(,/) -> This is the over iterator used with join. It takes the 1st table, joins it to the 2nd, then takes that and joins on to the 3rd etc.

https://code.kx.com/q/ref/over/

{?[x;();0b;`grpCol`v`typCol!`grpCol,y]}[df] each cls

// functional select

?[table;  where;  by;  columns]
?[x;      ();     0b;  `grpCol`v`typCol!`grpCol,y]

This creates a list of tables, 1 for each column pair in the cls variable. Notice how I don't explicitly state x or y in the function like this {[x;y]}. This is because x y and z can be used implicitly, so this function works with or without.

The important part here is the last param (columns). For a functional select it is a dictionary with column names as the key and what the columns are as the values

e.g. `grpCol`v`typCol!`grpCol`v_1`typCol_1 -> this is renaming each v and typCol so they are the same to then join them all together with (,/).

There is a useful keyword to help with figuring out functional form -> parse

parse"select Group, Value:V1, Type:Type_v1 from df"

0   ?
1   `df
2   ()
3   0b
4   (`Group`Value`Type)!`Group`V1`Type_v1

    P:exec distinct typCol from df;
    df:exec P#(typCol!v) by grpCol:grpCol from df;

pivoting is outlined here: https://code.kx.com/q/kb/pivoting-tables/

It effectively flips/rotates a section of the table. It takes the distinct types from typCol as the columns and uses the v column as the rows for each corresponding typCol

?[table;  where;  by;  columns]
?[df;();0b;(`grpCol,raze P,'`$"typCol_",/:string 1 + til count P)!`grpCol,raze flip (P;enlist each P)]

Again look at the last param in the functional select i.e. columns. This is how it looks after being dynamically generated:

(`grpCol`t1`typCol_1`t2`typCol_2`t3`typCol_3)!(`grpCol;`t1;enlist `t1;`t2;enlist `t2;`t3;enlist `t3)

It is kind of a hacky way to get the type columns, I select each t1 t2 t3 with a typeCol_1 _2 _3,

`t1 = (column) `t1
`typCol_1 = enlist `t1 -> the enlist here tells kdb I want the value `t1 rather than the column