Search code examples
kdb

KDB: Select rows corresponding to two updated columns


Consider financial quote data in which the bid and ask don't always get updated simultaneously. I would like to select just the rows where both the bid and the ask reflect the new market levels. In the table below, this would correspond to selecting rows t1,t5,t7,t9. Any elegant way of doing this? (Alternatively, I would like to exclude rows t2,t3,t4, which correspond to times where only one of bid/ask has been updated).

time bid ask
t1 12 13
t2 12 14
t3 12 14
t4 12 14
t5 13 14
t6 13 14
t7 14 15
t8 14 15
t9 13 14

Solution

  • This should do the trick (haven't tested it a huge amount)

    tab:([] time:`t1`t2`t3`t4`t5`t6`t7`t8`t9;bid:12 12 12 12 13 13 14 14 13;ask:13 14 14 14 14 14 15 15 14)
    
    q)select from tab where differ {$[all x<y;y;x]}\[flip sums each differ each (bid;ask)]
    time bid ask
    ------------
    t1   12  13 
    t5   13  14 
    t7   14  15 
    t9   13  14 
    
    Another example which contains more edge cases:
    
    tab:([] time:`g`b`b`b`b`g`b`b`g`b`g`g`b`g;bid:12 12 12 12 12 13 13 14 13 13 14 13 14 14;ask:13 13 14 15 14 14 14 14 15 16 16 15 15 16)
    
    q)select from tab where differ {$[all x<y;y;x]}\[flip sums each differ each (bid;ask)]
    time bid ask
    ------------
    g    12  13 
    g    13  14 
    g    13  15 
    g    14  16 
    g    13  15 
    g    14  16 
    

    There may be a slightly cleaner way of doing this, but I would test this method for now.

    EDIT: More efficient to do the flip after the sums - changed above.

    A previous approach I looked at used just the booleans from the differ on each column. This method will work (and might be more intuitive) but is less efficient both in terms of time and memory so I would stick to something close to the first approach above.

    scanner:{if[all x;x:not x];$[(y&z)|(x[0]&z)|x[1]&y;11b;x|(y;z)]}
    
    q)select from tab where all each scanner\[00b;differ bid;differ ask]
    time bid ask
    ------------
    t1   12  13
    t5   13  14
    t7   14  15
    t9   13  14