Search code examples
kdb+qpython

kdb select only one row for each minute


i have some records on kdb, as you see we have 10 row for time: "8:33" and 19 row for time "8:34" and some other row for time "8:35".

now, how can i select only one row for each group of minute? for example final result should be like this: ( only one row which have older second should be chosen for final result)

result should be:

9  2022-04-15 08:33:59.453   b'buy'  b'ETH-USDT'
28 2022-04-15 08:34:45.511  b'sell'  b'ETH-USDT'
43 2022-04-15 08:35:33.451  b'sell'  b'ETH-USDT'

my kdb:

0  2022-04-15 08:33:52.514  b'sell'  b'ETH-USDT'
1  2022-04-15 08:33:56.459   b'buy'  b'ETH-USDT'
2  2022-04-15 08:33:58.465  b'sell'  b'ETH-USDT'
3  2022-04-15 08:33:58.847  b'sell'  b'ETH-USDT'
4  2022-04-15 08:33:58.848   b'buy'  b'ETH-USDT'
5  2022-04-15 08:33:58.849  b'sell'  b'ETH-USDT'
6  2022-04-15 08:33:58.849  b'sell'  b'ETH-USDT'
7  2022-04-15 08:33:59.262  b'sell'  b'ETH-USDT'
8  2022-04-15 08:33:59.262  b'sell'  b'ETH-USDT'
9  2022-04-15 08:33:59.453   b'buy'  b'ETH-USDT'
10 2022-04-15 08:34:02.057   b'buy'  b'ETH-USDT'
11 2022-04-15 08:34:02.387  b'sell'  b'ETH-USDT'
12 2022-04-15 08:34:02.388  b'sell'  b'ETH-USDT'
13 2022-04-15 08:34:03.086   b'buy'  b'ETH-USDT'
14 2022-04-15 08:34:06.176   b'buy'  b'ETH-USDT'
15 2022-04-15 08:34:11.061   b'buy'  b'ETH-USDT'
16 2022-04-15 08:34:14.330   b'buy'  b'ETH-USDT'
17 2022-04-15 08:34:15.381   b'buy'  b'ETH-USDT'
18 2022-04-15 08:34:24.464   b'buy'  b'ETH-USDT'
19 2022-04-15 08:34:26.146  b'sell'  b'ETH-USDT'
20 2022-04-15 08:34:26.394   b'buy'  b'ETH-USDT'
21 2022-04-15 08:34:26.571  b'sell'  b'ETH-USDT'
22 2022-04-15 08:34:27.175  b'sell'  b'ETH-USDT'
23 2022-04-15 08:34:27.654   b'buy'  b'ETH-USDT'
24 2022-04-15 08:34:28.537   b'buy'  b'ETH-USDT'
25 2022-04-15 08:34:30.264  b'sell'  b'ETH-USDT'
26 2022-04-15 08:34:38.094  b'sell'  b'ETH-USDT'
27 2022-04-15 08:34:41.202  b'sell'  b'ETH-USDT'
28 2022-04-15 08:34:45.511  b'sell'  b'ETH-USDT'
29 2022-04-15 08:35:02.625   b'buy'  b'ETH-USDT'
30 2022-04-15 08:35:04.008   b'buy'  b'ETH-USDT'
31 2022-04-15 08:35:06.377  b'sell'  b'ETH-USDT'
32 2022-04-15 08:35:23.843  b'sell'  b'ETH-USDT'
33 2022-04-15 08:35:23.845  b'sell'  b'ETH-USDT'
34 2022-04-15 08:35:23.845  b'sell'  b'ETH-USDT'
35 2022-04-15 08:35:23.845  b'sell'  b'ETH-USDT'
36 2022-04-15 08:35:23.845  b'sell'  b'ETH-USDT'
37 2022-04-15 08:35:23.845  b'sell'  b'ETH-USDT'
38 2022-04-15 08:35:26.986  b'sell'  b'ETH-USDT'
39 2022-04-15 08:35:31.219   b'buy'  b'ETH-USDT'
40 2022-04-15 08:35:31.221   b'buy'  b'ETH-USDT'
41 2022-04-15 08:35:31.897   b'buy'  b'ETH-USDT'
42 2022-04-15 08:35:31.898  b'sell'  b'ETH-USDT'
43 2022-04-15 08:35:33.451  b'sell'  b'ETH-USDT'

Solution

  • You can use:

    value select by b.minute from t
    

    where b is your column of timestamps and t is your table.

    Example:

    q)show t:([]a:til 10; b:.z.p-00:04 00:04 00:04 00:03 00:03 00:03 00:01 00:00 00:00 00:00; c:10?`3 )
    a b                             c
    -----------------------------------
    0 2022.04.15D09:09:11.100787000 ndd
    1 2022.04.15D09:09:11.100787000 hpb
    2 2022.04.15D09:09:11.100787000 ihm
    3 2022.04.15D09:10:11.100787000 fph
    4 2022.04.15D09:10:11.100787000 gmg
    5 2022.04.15D09:10:11.100787000 kpf
    6 2022.04.15D09:12:11.100787000 ecp
    7 2022.04.15D09:13:11.100787000 mai
    8 2022.04.15D09:13:11.100787000 bof
    9 2022.04.15D09:13:11.100787000 mno
    q)select by b.minute from t
    minute| a b                             c
    ------| -----------------------------------
    09:09 | 2 2022.04.15D09:09:11.100787000 ihm
    09:10 | 5 2022.04.15D09:10:11.100787000 kpf
    09:12 | 6 2022.04.15D09:12:11.100787000 ecp
    09:13 | 9 2022.04.15D09:13:11.100787000 mno
    q)value select by b.minute from t
    a b                             c
    -----------------------------------
    2 2022.04.15D09:09:11.100787000 ihm
    5 2022.04.15D09:10:11.100787000 kpf
    6 2022.04.15D09:12:11.100787000 ecp
    9 2022.04.15D09:13:11.100787000 mno