Search code examples
partitionkdb

KDB: How to Delete rows from Partitioned Table


I have the below query used to delete rows from a partitioned table, but it doesn't work. What is the approach used for deleting rows in a partitioned table?

delete from SecurityLoan where lender=`SCOTIA, date in inDays, portfolio in portfoliolist

Note that inDays and portfoliolist are lists


Solution

  • Here's a slightly different method that re-indexes a column in a partition to a new list of indices that you want to keep in that column.

    It still follows the same semantics of reading a column in, amending and then resetting it back to disk, just uses a slightly different approach. However, by doing it this way you can grab the indices you want to remove, simply by using a qsql query. It then grabs the full list of indices in a partition, and runs 'except' against the initial list, resulting in the ones you actually want to keep.

    It becomes powerful when all you want to do is delete the contents of a sql query from a database/table (as is the case in yours).

    // I've commented this function as much as possible to break it down and explain the approach
    // db is where the database lives (hsym)
    // qry is the qsql query (string)
    q)delFromDisk:{[db;qry]
        // grab the tree from the query
        q:parse qry; 
        // cache partition counts
        .Q.cn `. t:q 1;
        // grab i by partition for your qry using the where clause
        d:?[t;raze q 2;{x!x}1#f:.Q.pf;enlist[`delis]!1#`i];
        // grab full indice list for each partition
        a:1!flip (f,`allis)!(`. f;til each .Q.pn t);
        // run except on full indice list and your query's indice list
        r:update newis:allis except'delis from a,'d;
        // grab columns except partition domain
        c:cols[t] except .Q.pf;
        // grab partitions that actually need modifications and make them dir handles
        p:update dirs:.Q.par[db;;t] each p[.Q.pf] from p:0!select from r where not allis~'newis;
        // apply on disk to directory handle (x), on column (y), to new indices (z)
        m:{@[x;y;@;z]};
        // grab params from p
        pa:`dirs`c`newis#p cross ([]c);
        // modify each column in a partition, one partition at a time
        m .' value each pa
        };
    
    // test data/table
    q)portfolio:`one`two`three`four`five;
    q)lender:`user1`user2`user3`user4;
    q)n:5;
    // set to disk in date partitioned format
    q)`:./2017.01.01/secLoan/ set .Q.en[`:./] ([]lender:n?lender;portfolio:n?portfolio);
    q)`:./2017.01.02/secLoan/ set .Q.en[`:./] ([]lender:n?lender;portfolio:n?portfolio);
    // load db
    q)\l .
    // lets say we want to delete from secLoan where lender in `user3 and portfolio in `one`two`three
    // please note, this query does not have a date constraint, so it may be an inefficient query if you where-clause produces large results. Once happy with the util as a whole, it can be re-jigged to select+delete per partition
    q)select from secLoan where lender in `user3,portfolio in `one`two`three
    date       lender portfolio
    ---------------------------
    2017.01.01 user3  one
    2017.01.01 user3  two
    2017.01.02 user3  one
    // 3 rows need deleted, 2 from first partition, 1 from second partition
    // 10 rows exist
    q)count secLoan
    10
    
    // run delete function
    q)delFromDisk[`:.;"select from secLoan where lender in `user3,portfolio in `one`two`three"];
    // reload to see diffs
    q)\l .
    q)count secLoan
    7
    // rows deleted
    q)secLoan
    date       lender portfolio
    ---------------------------
    2017.01.01 user2  five
    2017.01.01 user4  three
    2017.01.01 user2  three
    2017.01.02 user2  five
    2017.01.02 user2  two
    2017.01.02 user4  three
    2017.01.02 user1  five
    
    // PS - can accept a delete qsql query as all the function does is look at the where clause
    // delFromDisk[`:.;"delete from secLoan where lender in `user3,portfolio in `one`two`three"]