Search code examples
kdbq-lang

KDB performance: searching first item quicky


I have a sorted list v of around 20K items. I want to split it into 2 lists at the point where first v[i]>K

N:20000;
v:asc N?100000;     / N random numbers sorted
K:200;              / threshold
v1:v[where v<=K];   / "v<=K" has O(N) complexity, "where" has O(N) too
v2:(count v1) _ v;  / list is sorted, this holds.

Question: how to avoid v<=200, so it does not calculate whole intermediate boolean vector of length N, in other words, does not compare values after first match found? I actually need an index to perform split at. Assume K is found close to the beginning of a list.

This is performance-related question. (NB Ignore time spent on "asc".)


Solution

  • To avoid calculating the list of booleans, you can take advantage of the fact your list is sorted and use binr:

    c:v binr K  //42
    v1:c # v
    v2:c _ v
    

    This greatly improves the speed of operation:

    q)\ts:10000 v1:v[where v<=K];v2:(count v1) _ v
    680 262608
    q)\ts:10000 c:v binr K;v1:c # v;v2:c _ v
    75 262560