I've written the following query:
select first(o), max(h), min(l), last(c), sum(v) by sym, t.date from disk
where 'disk' is a parted splayed disk table containing 160m rows.
When I run this query, all RAM gets consumed and then KDB dies.
What am I doing wrong?
Edit My schema is:
c | t f a
---| -----
t | p
o | f
h | f
l | f
c | f
v | i
sym| s p
There is a helpful function in the .Q
namespace, .Q.dpft
here https://code.kx.com/q/ref/dotq/#qdpft-save-table which saves tables splayed to a partitioned database, that you can wrap in a lambda and pass each date e.g.
q) / table might be called alltrades
q)table:([]date:"d"$100?7000+til 10;sym:100?`AAA`BBB`CCC;price:100?100.0)
q)table
date sym price
-----------------------
2019.03.10 BBB 49.81119
2019.03.08 CCC 8.997612
2019.03.08 CCC 22.74166
2019.03.06 BBB 86.544
..
q) / below, t might be called trade
q){t::select from table where date=x; .Q.dpft[`:hdb; x; `sym;`t]}'[exec distinct date from table]
`t`t`t`t`t`t`t`t`t`t
q) \\
user@host:~$ cd hdb
user@host:~/hdb$ ls
2019.03.02 2019.03.03 2019.03.04 2019.03.05 2019.03.06 2019.03.07 2019.03.08 2019.03.09 2019.03.10 2019.03.11 sym
user@host:~/hdb$ cd 2019.03.02
user@host:~/hdb/2019.03.02$ ls
t
user@host:~/hdb/2019.03.02$ cd t
user@host:~/hdb/2019.03.02/t$ ls
date price sym
where hdb
is the directory path you want to save to, sym
is the symbol column which will be automatically enumerated and table
is the table in memory which you want to partition.