Search code examples
kdb

Get Default value instead of null while doing KDB select


I wanted to get default value when particular column values is null or blank.

e.g select customer,date,product,orderId,version,size from tableA where date=2020.04.08,product in (`Derivative)

returns :

+----------+----------+------------+---------+---------+------+
| customer |   date   |  product   | orderId | version | size |
+----------+----------+------------+---------+---------+------+
| XYZ fund | 4/8/2020 | Derivative |       1 |       6 |      |
| XYZ fund | 4/8/2020 | Derivative |       2 |       6 | 1000 |
| XYZ fund | 4/8/2020 | Derivative |       3 |       4 |      |
+----------+----------+------------+---------+---------+------+

So I want NA instead of blank or null s following :

+----------+----------+------------+---------+---------+------+
| customer |   date   |  product   | orderId | version | size |
+----------+----------+------------+---------+---------+------+
| XYZ fund | 4/8/2020 | Derivative |       1 |       6 |  NA  |
| XYZ fund | 4/8/2020 | Derivative |       2 |       6 | 1000 |
| XYZ fund | 4/8/2020 | Derivative |       3 |       4 |  NA  |
+----------+----------+------------+---------+---------+------+

Solution

  • This is difficult to do with a value like NA because q prefers vectors to be of uniform type. Assuming that your size column is a numeric type rather than a symbol you'll run into issues.

    I'd advise filling in some numeric value with the ^ fill operator. For example, to fill in zero as the default value:

    q)res:select customer,date,product,orderId,version,size from tableA where date=2020.04.08,product in (`Derivative)
    q)update 0^size from res
    customer date     product    orderId version size
    -------------------------------------------------
    XYZ fund 4/8/2020 Derivative 1       6       0  
    XYZ fund 4/8/2020 Derivative 2       6       1000
    XYZ fund 4/8/2020 Derivative 3       4       0  
    

    If it's important to use the NA value then you can cast the column to symbol type and fill with `NA:

    q)res:select customer,date,product,orderId,version,size from tableA where date=2020.04.08,product in (`Derivative)
    q)update `NA^`$string size from res
    customer date     product    orderId version size
    -------------------------------------------------
    XYZ fund 4/8/2020 Derivative 1       6       NA  
    XYZ fund 4/8/2020 Derivative 2       6       1000
    XYZ fund 4/8/2020 Derivative 3       4       NA  
    

    The downside of doing this is that you will no longer be able to do numeric operations on the column without first casting back to a numeric type:

    q)select sum size from res
    'type
      [0]  select sum size from res
    q)select sum "J"$string size from res
    size
    ----
    1000