In KDB, for each ticker
I'd like to get all the rows where signal=1
except the first occurrence in each ticker.
og:([]
ticker:raze 5#/:(`AAPL`GOOGL`MSFT);
date:raze 3#enlist 2023.01.01+til 5;
price:45 46 47 48 49,100 102 101 103 105,200 201 199 202 205;
volume:1000 1100 900 1200 1300,500 520 510 530 550,300 310 290 320 330;
signal:0 0 1 1 0,0 1 1 1 0,0 0 0 1 1 / Example entry signals
);
t:select from og where signal=1;
t:update valid:1b from t where date>(min;date) fby ticker;
t:select from t where valid;
show t: delete valid from t; / correct but verbose
q)og
ticker date price volume signal
-------------------------------------
AAPL 2023.01.01 45 1000 0
AAPL 2023.01.02 46 1100 0
AAPL 2023.01.03 47 900 1
AAPL 2023.01.04 48 1200 1
AAPL 2023.01.05 49 1300 0
GOOGL 2023.01.01 100 500 0
GOOGL 2023.01.02 102 520 1
GOOGL 2023.01.03 101 510 1
GOOGL 2023.01.04 103 530 1
GOOGL 2023.01.05 105 550 0
MSFT 2023.01.01 200 300 0
MSFT 2023.01.02 201 310 0
MSFT 2023.01.03 199 290 0
MSFT 2023.01.04 202 320 1
MSFT 2023.01.05 205 330 1
q)t
ticker date price volume signal
-------------------------------------
AAPL 2023.01.04 48 1200 1
GOOGL 2023.01.03 101 510 1
GOOGL 2023.01.04 103 530 1
MSFT 2023.01.05 205 330 1
I find my approach quite verbose, and I'm wondering if one can be terser and hopefully more efficient. Just for reference, the naive/incorrect select
statement below gives an undesired result
q) select by ticker from og where date>date[first where signal=1] / incorrect
ticker| date price volume signal
------| ------------------------------
AAPL | 2023.01.05 49 1300 0
GOOGL | 2023.01.05 105 550 0
MSFT | 2023.01.05 205 330 1
You can achieve this succinctly using fby
(https://code.kx.com/q/ref/fby), and the virtual index column i
:
q)select from og where signal=1,i>(first;i)fby ticker
ticker date price volume signal
-------------------------------------
AAPL 2023.01.04 48 1200 1
GOOGL 2023.01.03 101 510 1
GOOGL 2023.01.04 103 530 1
MSFT 2023.01.05 205 330 1