Search code examples
kdb+qsqlqueryqsqldatabaseqsqlquerymodel

KDB/Q-sql Dynamic Grouping and con-canting columns in output


I have a table where I have to perform group by on dynamic columns and perform aggregation, result will be column values concatenating group-by tables and aggregations on col supplied by users.

For example :

g1  g2  g3  g4  col1    col2
A   D   F   H   10      20
A   E   G   I   11      21
B   D   G   J   12      22
B   E   F   L   13      23
C   D   F   M   14      24
C   D   G   M   15      25

and if I need to perform group by g1,g2,g4 and avg aggregation on col1 output should be like this

    filed           val
Avg[A-D-H-col1]    10.0
Avg[A-E-I-col1]    11.0
Avg[B-D-J-col1]    12.0
Avg[B-E-L-col1]    13.0
Avg[C-D-M-col1]    14.5

I am able to perform this if my group by columns are fixed using q-sql

t:([]g1:`A`A`B`B`C`C;g2:`D`E`D`E`D`D;g3:`F`G`G`F`F`G;g4:`H`I`J`L`M`M;col1:10 11 12 13 14 15;col2:20 21 22 23 24 25)

select filed:first ("Avg[",/:(({"-" sv x} each string (g1,'g2,'g4)),\:"-col1]")),val: avg col1 by g1,g2,g4 from t

I want to use functional query for the same , means I want a function which take list of group by columns, aggregation to perform and col name andtable name as input and output like above query. I can perform group by easily using dynamic columns but not able to con-cat in fields. function signature will be something like this

fun{[glist; agg; col,t] .. ;... }[g1g2g4;avg;col1,t]

Please help me to make above query as dynamic.


Solution

  • You may try following function:

    specialGroup: {[glist;agg;col;table]
    
        res: ?[table;();{x!x}glist; enlist[`val]!enlist(agg;col)];
        aggname: string agg;
        aggname: upper[1#aggname], 1_aggname;
        res: ![res;();0b;enlist[`filed]!enlist({(y,"["),/:("-"sv/:string flip x),\:"]"};enlist,glist,enlist[enlist col];aggname)];
        res
    };
    
    specialGroup[`g1`g2`g4;avg;`col1;t]
    

    specialGroup aggregates values into val column first. And populates filed column after grouping. This helps to avoid generating filed duplicates and selecting first of them.