
Multithreaded rdb in kdb


I have some memory constraints in my current 32-bit kdb+/tick setup, in which my rdb consumes by far the most memory.

I know I can get around the 4 GB 32-bit addressability limit by using multiple threads with the -s flag when starting the q process, but I'm not sure how to apply that to an rdb, where the only function generating data is upd:insert.

Is it possible to allocate memory from other threads manually?


Solution

  • As far as I'm aware, you can't use threads as an extra source of memory for storing RDB data. They do have their own heaps, but the workings are all under the covers and not exposed enough for you to hijack them. Threads are designed for parallelizing functions/queries on read-only data.

    A couple of thoughts:

    1) You could create an architecture whereby you have multiple RDBs each subscribing to a subset of tables:

    RDB1 - subscribes to table1

    RDB2 - subscribes to table2, table3

    RDB3 - subscribes to table4

    Then you create a gateway process which is connected to each RDB. The gateway should determine which table you're trying to query and route the query to the relevant RDB.

    2) If a single table (e.g. the quote table) is still too big to be stored in a single 4 GB process, then you might have to think about splitting that table by ticker (i.e. RDB1a subscribes to the quote table for tickers A-M while RDB1b subscribes to the quote table for tickers N-Z). Your gateway would then have to be clever enough to know which tickers are being requested and route the query accordingly.

    3) If having an entire day's worth of data in the RDB at all times is not actually needed (i.e. you're only really using the RDB to save data to disk at end-of-day), then you should consider using the alt-RDB, which saves to disk periodically and keeps a smaller amount of data in memory at any given time (http://code.kx.com/q/cookbook/w-q/).

    4) If you are serious about storing all data in memory at all times, and you are collecting full trade/quote data then the only clean way to achieve this is with a production licence.
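
The routing that the gateway in points 1 and 2 has to do can be sketched roughly as below. This is a language-neutral illustration written in Python rather than q, and it only decides where a query should go; it does not open connections or merge results. The RDB names, the table-to-RDB mapping, the A-M / N-Z split and the `route` function are all assumptions made up for the example, not part of kdb+/tick.

```python
# Language-neutral sketch of gateway routing (not q code).
# All names and the A-M / N-Z split below are illustrative assumptions.

# Point 1: each unsplit table is served by exactly one RDB.
TABLE_TO_RDB = {
    "table1": "RDB1",
    "table2": "RDB2",
    "table3": "RDB2",
    "table4": "RDB3",
}

# Point 2: a table too large for one process is split by the ticker's
# first letter. Each entry is (low letter, high letter, RDB name),
# with both bounds inclusive.
SPLIT_TABLES = {
    "quote": [("A", "M", "RDB1a"), ("N", "Z", "RDB1b")],
}


def route(table, tickers=()):
    """Return {rdb_name: [tickers]} describing where to send a query.

    An empty ticker list means "no ticker filter". For a split table
    with no tickers given, the query fans out to every shard.
    """
    if table in SPLIT_TABLES:
        shards = SPLIT_TABLES[table]
        if not tickers:
            return {rdb: [] for _, _, rdb in shards}
        plan = {}
        for ticker in tickers:
            first = ticker[:1].upper()
            for lo, hi, rdb in shards:
                if lo <= first <= hi:
                    plan.setdefault(rdb, []).append(ticker)
                    break
            else:
                # No shard's letter range contains this ticker.
                raise ValueError(f"no shard covers ticker {ticker!r}")
        return plan
    if table in TABLE_TO_RDB:
        return {TABLE_TO_RDB[table]: list(tickers)}
    raise KeyError(f"unknown table {table!r}")
```

With these assumed mappings, `route("quote", ["AAPL", "MSFT", "XOM"])` sends AAPL and MSFT to RDB1a and XOM to RDB1b, so the gateway would still have to join the two partial results before returning them to the client.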