Pivot table in kdb+/q


I'm trying to pivot some trade data in KDB/q. Although my data are only slightly different from the working example on the website (see the general pivot function: http://code.kx.com/q/cookbook/pivoting-tables/), I can't get the function to work, even after several hours of trying (I'm very new to KDB).

Put simply, I'm trying to go from this table:

q)5# trades_agg
date       sym  time  exchange buysell| shares
--------------------------------------| ------
2009.01.05 aaca 09:30 BATS     B      | 484
2009.01.05 aaca 09:30 BATS     S      | 434
2009.01.05 aaca 09:30 NASDAQ   B      | 235
2009.01.05 aaca 09:30 NASDAQ   S      | 429
2009.01.05 aaca 09:30 NYSE     B      | 309

to this one:

date       sym  time  | BATSsharesB BATSsharesS NASDAQsharesB ...
----------------------| -------------------------------------
2009.01.05 aaca 09:30 | 484         434         235           ...
...                   | ...

I'll provide a working example to illustrate things:

// Create data
qpd:5*2*4*"i"$16:00-09:30
date:raze(100*qpd)#'2009.01.05+til 5
sym:(raze/)5#enlist qpd#'100?`4
sym:(neg count sym)?sym
time:"t"$raze 500#enlist 09:30:00+15*til qpd
time+:(count time)?1000
exchange:raze 500#enlist raze(qpd div 3)#enlist`NYSE`NASDAQ`BATS
buysell:raze 500#enlist raze(qpd div 2)#enlist`B`S
shares:(500*qpd)?100
trades:([]date;sym;time;exchange;buysell;shares)
// I then aggregate the data into equal-sized 15-minute buckets
trades_agg: select sum shares by date, sym, time: 15 xbar time.minute, exchange, buysell from trades

// pivot function from the code.kx.com website
piv:{[t;k;p;v;f;g]
 v:(),v;
 G:group flip k!(t:.Q.v t)k;
 F:group flip p!t p;
 count[k]!g[k;P;C]xcols 0!key[G]!flip(C:f[v]P:flip value flip key F)!raze
  {[i;j;k;x;y]
   a:count[x]#x 0N;
   a[y]:x y;
   b:count[x]#0b;
   b[y]:1b;
   c:a i;
   c[k]:first'[a[j]@'where'[b j]];
   c}[I[;0];I J;J:where 1<>count'[I:value G]]/:\:[t v;value F]}

I subsequently apply this pivot function to the example with the functions f and g set to their default (::) values but I get an error message:

piv[`trades_agg;`date`sym`time;`exchange`buysell;`shares;(::);(::)]

Even when I use the suggested f and g functions it doesn't work:

 f:{[v;P]`$raze each string raze P[;0],'/:v,/:\:P[;1]}
 g:{[k;P;c]k,(raze/)flip flip each 5 cut'10 cut raze reverse 10 cut asc c}

I don't get why this is not working correctly since it is so close to the example on the website.


Solution

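  • This is a self-contained version that's easier to use. It takes only the table name, the key columns, the pivot columns and the value columns; the column-naming function f is defined inside it, and the column-reordering function g is dropped: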

    tt:1000#0!trades_agg  / unkey trades_agg and keep the first 1000 rows
    
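    piv:{[t;k;p;v]
         / f controls the new column names
         f:{[v;P]`${raze " " sv x} each string raze P[;0],'/:v,/:\:P[;1]};
         v:(),v; k:(),k; p:(),p; / make sure args are lists
         G:group flip k!(t:.Q.v t)k; / row indices for each distinct key row
         F:group flip p!t p; / row indices for each distinct pivot combination
         key[G]!flip(C:f[v]P:flip value flip key F)!raze
          {[i;j;k;x;y]
           a:count[x]#x 0N; / typed nulls, one per row
           a[y]:x y; / fill in the values for rows in this pivot group
           b:count[x]#0b;
           b[y]:1b; / flag the rows in this pivot group
           c:a i; / value at the first row of each key group
           c[k]:first'[a[j]@'where'[b j]]; / key groups with several rows: first flagged value
           c}[I[;0];I J;J:where 1<>count'[I:value G]]/:\:[t v;value F]};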
    
    
    
    q)piv[`tt;`date`sym`time;`exchange`buysell;enlist `shares]
    date       sym  time | BATS shares B BATS shares S NASDAQ shares B NASDAQ sha..
    ---------------------| ------------------------------------------------------..
    2009.01.05 adkk 09:30| 577           359           499             452       ..
    2009.01.05 adkk 09:45| 882           501           339             467       ..
    2009.01.05 adkk 10:00| 620           513           411             128       ..
    2009.01.05 adkk 10:15| 501           544           272             544       ..
    2009.01.05 adkk 10:30| 291           594           363             331       ..
    2009.01.05 adkk 10:45| 867           500           498             536       ..
    2009.01.05 adkk 11:00| 624           632           694             493       ..
    2009.01.05 adkk 11:15| 99            704           600             299       ..
    2009.01.05 adkk 11:30| 269           394           280             392       ..
    2009.01.05 adkk 11:45| 635           744           758             597       ..
    2009.01.05 adkk 12:00| 562           354           498             405       ..
    2009.01.05 adkk 12:15| 416           437           303             492       ..
    2009.01.05 adkk 12:30| 447           699           370             302       ..
    2009.01.05 adkk 12:45| 336           647           512             245       ..
    2009.01.05 adkk 13:00| 692           457           497             553       ..
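
    The column names above contain spaces (for example "BATS shares B"), whereas the layout asked for in the question uses names like BATSsharesB. The spaces come from the " " sv join inside f. As a minimal sketch, the naming function quoted in the question can be tried on its own against a toy list of pivot combinations; the name fNoSpace and the toy P are illustrative assumptions, not part of the original answer. Swapping this body in for f inside piv should then give space-free names:

    ```q
    / hypothetical name: fNoSpace is the naming function from the question, without the " " sv join
    fNoSpace:{[v;P]`$raze each string raze P[;0],'/:v,/:\:P[;1]}
    / toy input shaped like P inside piv: one (exchange;buysell) pair per new column
    P:(`BATS`B;`BATS`S;`NASDAQ`B)
    fNoSpace[enlist`shares;P]
    / expected: `BATSsharesB`BATSsharesS`NASDAQsharesB
    ```

    Space-free names can be used directly in a select, which names containing spaces cannot.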
    

    Nine years later, I created a UI that lets you pivot without writing code; it pushes the piv function down to kdb: https://www.timestored.com/pulse/help/table-pivot
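
    For comparison, here is a minimal sketch of the shorter exec P#(p!v) by ... idiom that is often used for simple q pivots, applied to the two pivot columns by first combining them into a single name column. It assumes the data are already aggregated, so each date, sym, time, exchange and buysell combination occurs at most once. The names t, pc, pcol, P and r are illustrative, and the toy rows are copied from the sample in the question:

    ```q
    / toy input shaped like 0!trades_agg
    t:([]date:5#2009.01.05;sym:5#`aaca;time:5#09:30;exchange:`BATS`BATS`NASDAQ`NASDAQ`NYSE;buysell:`B`S`B`S`B;shares:484 434 235 429 309)
    / build one combined name per row, e.g. BATSsharesB (done outside q-sql to avoid comma parsing issues)
    pc:`$(string[t`exchange],\:"shares"),'string t`buysell
    t:update pcol:pc from t
    / distinct new column names, sorted
    P:asc distinct t`pcol
    / one dictionary of name!shares per key group; P# fixes the column set and order
    r:exec P#(pcol!shares) by date:date,sym:sym,time:time from t
    ```

    Here r should be a table keyed on date, sym and time with one column per combined name; combinations missing from a key group come out as nulls.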