Search code examples
kdbq-lang

How to update dependent values in KDB table


I am struggling finding an efficient way of updating column values fv when it depends on the previous values of some other columns sAA and sBB. These in turn needs to be updated with the newly calculated version of fv for the current date, then the next fv can be calculated, and so on.

To illustrate the problem, consider a simplistic example: Let's say I have a table of some prices of some stocks AA and BB, and weights wAA and wBB, per trade date td. Additionally I have a fund value fv and shares owned columns sAA and sBB.

q)t:([] td:2001.01.01 2001.01.02 2001.01.03 2001.01.04 2001.01.05 2001.01.06; 
AA:121.5 125.0 127.0 126.0 129.2 130.0; 
BB:111.0 115.3 117.0 116.0 119.2 120.0; 
wAA: 0.2 0.2 0.3 0.4 0.5 0.9; 
wBB: 0.8 0.8 0.7 0.6 0.5 0.1; 
fv:100000 0N 0n 0n 0n 0n;
sAA: 0n; sBB: 0n;)

(Roundtrip: 000ms)
::

q) t

td          AA      BB      wAA     wBB     fv       sAA   sBB
2001.01.01  121.5   111     0.2     0.8     100000    0n    0n
2001.01.02  125     115.3   0.2     0.8     0n        0n    0n
2001.01.03  127     117     0.3     0.7     0n        0n    0n
2001.01.04  126     116     0.4     0.6     0n        0n    0n
2001.01.05  129.2   119.2   0.5     0.5     0n        0n    0n
2001.01.06  130     120     0.9     0.1     0n        0n    0n

The fund value changes from date to date and set to, as an example, 1,000,000 for the first initial date (td(0)) but unknown for the rest. To calculate fv for any other row but the first: AA*(prev sAA) + BB*(prev sBB), i.e. dependent on the previous sAA and sBB values..

Both sAA and sBB is set to 0n for all dates and calculated as (pseudo code) sAA : fv * wAA % AA. For the first row, this is fine as fv(0) is known. For consecutive rows however I'm experiencing problems as KDB/Q operates in a column order and inline executions of updates does not update the table until the end. Consequently sAA and sBB remains 0n, and so does the consecutive fv values.

The end product should look similar to this:

 td          AA     BB     wAA    wBB    fv         sAA        sBB
2001.01.01   121.5  111    0.2    0.8    100000     164.6091   720.7207
2001.01.02   125    115.3  0.2    0.8    103675.2   165.8804   719.3425
2001.01.03   127    117    0.3    0.7    105229.7   248.574    629.5792
2001.01.04   126    116    0.4    0.6    104344.3   331.2519   539.7121
2001.01.05   129.2  119.2  0.5    0.5    107177.4   414.7732   449.5696
2001.01.06   130    120    0.9    0.1    107885.7   746.9007   89.90472

What is an efficient way to get from the starting point t to the end point above? If you restructure the data then I'd appreciate a line or two explaining that (I should be able to follow other code).

Whilst I have solved it (and I include my solution below, shameful as it may be), my data and, to some extent, my own thinking is suffering from the classic "short and fat" and "row by row" thinking and approach (bad form in Q). My solution is therefore not scalable or easily maintained, calculates the same value several times, traverses the data multiple times, uses "global" variables, and other poor code. I have tried a few other things before arriving at the below, such as executing an update statement updating fv, sAA and sBB in the same statement however as KDB works on columns it doesn't update sAA and sBB before moving onto the next fv.

My current and poor solution (just imagine the overhead of adding another 100 or 1000 stocks... not really conducive use of anyone's time):

t:([] td:2001.01.01 2001.01.02 2001.01.03 2001.01.04 2001.01.05 2001.01.06; 
AA:121.5 125.0 127.0 126.0 129.2 130.0; 
BB:111.0 115.3 117.0 116.0 119.2 120.0; 
wAA: 0.2 0.2 0.3 0.4 0.5 0.9; 
wBB: 0.8 0.8 0.7 0.6 0.5 0.1; 
fv:100000 0N 0n 0n 0n 0n;
sAA: 0n;
sBB: 0n)
t

kt:`td xkey t
// Calculate the first row of shares owned
kt:update sAA:fv*wAA%AA, sBB:fv*wBB%BB from kt
kt

// Global variables for previous shares owned
gPrevSAA:1.0
gPrevSBB:1.0

// Function to calclate the FV. If any of the previous shares owned paraeters are
// null then use the global parameters.
calcFV:{[fv;pSAA;pxA;pSBB;pxB]
    // The first time calcFV is called, pSAA will be defined. The remainder it will be null.
    $[pSAA=0n;pSAA:gPrevSAA;pSAA:pSAA];
    $[pSBB=0n;pSBB:gPrevSBB;pSBB:pSBB];

    // Calculate the fund value
    uFV:-1;
    $[fv=0n;uFV:(pSAA*pxA)+(pSBB*pxB);uFV:fv];

    // update global values
    $[pSAA<>0n;`gPrevSAA set pSAA;];
    $[pSBB<>0n;`gPrevSBB set pSBB;];
    uFV
}

// Calculate the fund values
kt:update fv:calcFV ' [fv;prev sAA;AA;prev sBB;BB] from kt

// Update the shares owned columns with the newly calcualted fund values
kt:update sAA:fv*wAA%AA, sBB:fv*wBB%BB from kt

Solution

  • Think you have to use an an over to do this http://code.kx.com/q/ref/adverbs/#over

    q)t:([] td:2001.01.01 2001.01.02 2001.01.03 2001.01.04 2001.01.05 2001.01.06; AA:121.5 125.0 127.0 126.0 129.2 130.0; BB:111.0 115.3 117.0 116.0 119.2 120.0; wAA: 0.2 0.2 0.3 0.4 0.5 0.9; wBB: 0.8 0.8 0.7 0.6 0.5 0.1; fv:100000 0N 0n 0n 0n 0n;sAA: 0n; sBB: 0n)
    q)({update fv:fv^(AA*prev sAA) + BB*prev sBB,sAA:fv*wAA%AA,sBB:fv*wBB%BB from x}/)t
    td         AA    BB    wAA wBB fv       sAA      sBB
    ---------------------------------------------------------
    2001.01.01 121.5 111   0.2 0.8 100000   164.6091 720.7207
    2001.01.02 125   115.3 0.2 0.8 103675.2 165.8804 719.3425
    2001.01.03 127   117   0.3 0.7 105229.9 248.5745 629.5804
    2001.01.04 126   116   0.4 0.6 104351.7 331.2753 539.7503
    2001.01.05 129.2 119.2 0.5 0.5 107139   414.6246 449.4086
    2001.01.06 130   120   0.9 0.1 107830.2 746.517  89.85852
    

    Although notice values differ slightly from your sample answer, not sure why