Search code examples
kdb

Optimize the query using group by on certain condition in q kdb


We have a table t as below

q)t:([] sym:10?`GOOG`AMZN`IBM; px:10?100.; size:10?1000; mkt:10?`ab`cd`ef)

Our requirement is to 'group by' the table 't' by column 'sym' if column 'mkt' value is 'ef', for rest of the markets('ab`cd') we need all the values(not group by). For this use case I have written below query which works as expected,

q)(select px, size, sym, mkt from select by sym from t where mkt=`ef), select px, size, sym, mkt from t where mkt in `ab`cd

please help me optimize the above query in a way i.e

sudo code - 
if mkt=`ef: 
    then use group by on table
else if mkt in `ab`cd
    don't use group by on table

Solution

  • I have found two different ways to make your query that are different from the one you have provided.

    You can use the following query to accomplish what you want in one select statement:

    select from t where (mkt<>`ef)|(mkt=`ef)&i=(last;i)fby ([]sym;mkt)
    

    However if you compare its speed:

    q)\t:1000 select from t where (mkt<>`ef)|(mkt=`ef)&i=(last;i)fby ([]sym;mkt)
    68
    

    to your original query:

    q)\t:1000 (select px, size, sym, mkt from select by sym from t where mkt=`ef), select px, size, sym, mkt from t where mkt in `ab`cd
    40
    

    You can see that your query is faster.

    Additionally you can try this which does not require explicitly stating every mkt in t you wish to not group by sym

    (0!select by sym from t where mkt=`ef),select from t where mkt<>`ef
    

    But again this ends up being around the same speed as your original solution:

    q)\t:1000 (0!select by sym from t where mkt=`ef),select from t where mkt<>`ef
    42
    

    So in terms of optimization it seems your query works well for what you want it to accomplish.