I have a sorted list v of around 20K items. I want to split it into 2 lists at the point where first v[i]>K
N:20000;
v:asc N?100000; / N random numbers sorted
K:200; / threshold
v1:v[where v<=K]; / "v<=K" has O(N) complexity, "where" has O(N) too
v2:(count v1) _ v; / list is sorted, this holds.
Question: how to avoid v<=200, so it does not calculate whole intermediate boolean vector of length N, in other words, does not compare values after first match found? I actually need an index to perform split at. Assume K is found close to the beginning of a list.
This is performance-related question. (NB Ignore time spent on "asc".)
To avoid calculating the list of booleans, you can take advantage of the fact your list is sorted and use binr
:
c:v binr K //42
v1:c # v
v2:c _ v
This greatly improves the speed of operation:
q)\ts:10000 v1:v[where v<=K];v2:(count v1) _ v
680 262608
q)\ts:10000 c:v binr K;v1:c # v;v2:c _ v
75 262560