Search code examples
pythonpython-3.xkdb+qpythonqpython3

How to fetch the table or view metadata from KDB and save to a data-structure?


I have been trying to fetch the metadata from a KDB+ Database using python, basically, I installed a library called qpython and using this library we connect and query the KDB+ Database.

I want to store the metadata for all the appropriate cols for a table/view in KDB+ Database using python. I am unable to separate the metadata part, despite trying myriad different approaches.

Namely a few to typecast the output to list/tuple, iterating using for, et cetera.

from qpython import qconnection

def fetch_metadata_from_kdb(params):
    try:
        kdb_connection_obj = qconnection.QConnection(host=params['host'], port=params['port'], username=params['username'], password=params['password'])
        kdb_connection_obj.open()
        PREDICATE = "meta[{}]".format(params['table'])
        metadata = kdb_connection_obj(PREDICATE)
        kdb_connection_obj.close()
        return metadata

    except Exception as error_msg: 
        return error_msg

def fetch_tables_from_kdb(params):
    try:
        kdb_connection_obj = qconnection.QConnection(host=params['host'], port=params['port'], username=params['username'], password=params['password'])
        kdb_connection_obj.open()
        tables = kdb_connection_obj("tables[]")
        views = kdb_connection_obj("views[]")
        kdb_connection_obj.close()
        return [table.decode() for table in list(tables)], [view.decode() for view in list(views)]

    except Exception as error_msg:
        return error_msg

parms_q = {'host':'localhost', 'port':5010,
           'username':'kdb', 'password':'kdb', 'table':'testing'}

print("fetch_tables_from_kdb:", fetch_tables_from_kdb(parms_q), "\n")
print("fetch_metadata_from_kdb:", fetch_metadata_from_kdb(parms_q), "\n")

The output which I am currently getting is as follows;

fetch_tables_from_kdb: (['testing'], ['viewname']) 

fetch_metadata_from_kdb: [(b'time',) (b'sym',) (b'price',) (b'qty',)]![(b'p', b'', b'') (b's', b'', b'') (b'f', b'', b'') (b'j', b'', b'')] 

I am not able to separate the columns part and the metadata part. How to store only the metadata for the appropriate column for a table/view in KDB using python?


Solution

  • The metadata that you have returned from kdb is correct but is being displayed in python as a kdb dictionary format which I agree is not very useful.

    If you pass the pandas=True flag into your qconnection call then qPython will parse kdb datastructures, such as a table into pandas data structures or sensible python types, which in your case looks like it will be more useful.

    Please see an example below - kdb setup (all on localhost)

    $ q -p 5000
    q)testing:([]date:.z.d+0 1 2;`g#sym:`abc`def`ghi;num:`s#10 20 30)
    q)testing
    date       sym num
    ------------------
    2022.01.31 abc 10
    2022.02.01 def 20
    2022.02.02 ghi 30
    q)meta testing
    c   | t f a
    ----| -----
    date| d
    sym | s   g
    num | j   s
    

    Python code

    from qpython import qconnection
    
    #create and open 2 connections to kdb process - 1 without pandas flag and one
    q = qconnection.QConnection(host="localhost", port=5000)
    qpandas = qconnection.QConnection(host="localhost", port=5000, pandas=True)
    q.open()
    qpandas.open()
    
    #see what is returned with a q table 
    print(q("testing"))
    [(8066, b'abc', 10) (8067, b'def', 20) (8068, b'ghi', 30)]
    #the data is a qPython data object
    type(q("testing"))
    qpython.qcollection.QTable
    
    #whereas using the pandas=True flag a dataframe is returned.
    print(qpandas("testing"))
            date     sym  num
    0 2022-01-31  b'abc'   10
    1 2022-02-01  b'def'   20
    2 2022-02-02  b'ghi'   30
    
    #This is the same for the meta of a table
    print(q("meta testing"))
    [(b'date',) (b'sym',) (b'num',)]![(b'd', b'', b'') (b's', b'', b'g') (b'j', b'', b's')]
    
    print(qpandas("meta testing"))
             t    f     a
    c
    b'date'  d  b''   b''
    b'sym'   s  b''  b'g'
    b'num'   j  b''  b's'
    

    With the above you can now access the columns and rows using pandas (the b'num' etc is the qPython way of expressing a backtick `

    Also now you have the ability to now use the DataFrame.info() to extract datatypes if you are more intrested in the python data structure rather than the kdb data structure/types. qPython will convert the q types to sensible python types automatically.

    qpandas("testing").info()
    <class 'pandas.core.frame.DataFrame'>
    RangeIndex: 3 entries, 0 to 2
    Data columns (total 3 columns):
     #   Column  Non-Null Count  Dtype
    ---  ------  --------------  -----
     0   date    3 non-null      datetime64[ns]
     1   sym     3 non-null      object
     2   num     3 non-null      int64
    dtypes: datetime64[ns](1), int64(1), object(1)
    memory usage: 200.0+ bytes