How can one compute a weighted median in KDB?
I can see that there is a function med for a simple median but I could not find something like wmed
similar to wavg.
Thank you very much for your help!
For values v and weights w, the expression med v where w gobbles space for larger values of w.
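To see why the space grows, here is a minimal sketch with hypothetical data (vals and wts are illustrative names, not part of the session below). With integer weights, where repeats each index as many times as its weight, so the intermediate list has as many items as the sum of the weights.

```q
/ hypothetical small example with integer weights
vals:10 20 30
wts:2 0 3
where wts             / 0 0 2 2 2
vals where wts        / 10 10 30 30 30
med vals where wts    / 30f
count vals where wts  / 5, the same as sum wts
```

Scaling fractional weights up to integers, as in the session below, multiplies the length of that intermediate list by the scale factor.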
Instead, sort w into ascending order of v and look for where the cumulative sums reach half their sum.
q)show v:10?100
17 23 12 66 36 37 44 28 20 30
q)show w:.001*10?1000
0.418 0.126 0.077 0.829 0.503 0.12 0.71 0.506 0.804 0.012
q)med v where "j"$w*1000
36f
q)w iasc v / sort w into ascending order of v
0.077 0.418 0.804 0.126 0.506 0.012 0.503 0.12 0.71 0.829
q)0.5 1*(sum;sums)@\:w iasc v / half the sum and cumulative sums of w
2.0525
0.077 0.495 1.299 1.425 1.931 1.943 2.446 2.566 3.276 4.105
q).[>]0.5 1*(sum;sums)@\:w iasc v / compared
1111110000b
q)v i sum .[>]0.5 1*(sum;sums)@\:w i:iasc v / weighted median
36
q)\ts:1000 med v where "j"$w*1000
18 132192
q)\ts:1000 v i sum .[>]0.5 1*(sum;sums)@\:w i:iasc v
2 2576
q)wmed:{x i sum .[>]0.5 1*(sum;sums)@\:y i:iasc x}
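A short usage sketch for wmed, assuming values as the first argument and weights as the second, as in the definition above. The data are hypothetical and chosen to be checkable by hand. The second call illustrates a consequence of the strict comparison: when a cumulative sum lands exactly on half the total, the lower of the two candidate values is returned, whereas med averages the two middle values.

```q
wmed:{x i sum .[>]0.5 1*(sum;sums)@\:y i:iasc x}   / as defined above

/ hypothetical data: sorted values 10 20 30 carry weights 1 5 1
/ half the total weight is 3.5; the cumulative sums are 1 6 7
vals:30 10 20
wts:1 1 5f
wmed[vals;wts]            / 20

/ equal weights, even count: cumulative sums 1 2 3 4, half the total is 2
wmed[1 2 3 4;1 1 1 1f]    / 2
med 1 2 3 4               / 2.5
```

If the averaging behaviour of med matters for ties, the function would need an extra step; that is outside what the answer above covers.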
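Some vector techniques worth noticing:
(sum;sums)@\: applies both functions to the same list, and Apply . with an operator works on the result, rather than setting a variable, e.g. (0.5*sum yi)>sums yi:y i, or defining an inner lambda, e.g. {sums[x]<0.5*sum x}y i
iasc grades one list in order to sort another
v i sum .. chains indexing and function application by juxtaposition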
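The alternatives named above can be written out in full for comparison. This is a sketch: wmedVar and wmedLam are hypothetical names for the variable-setting and inner-lambda forms, and the data are illustrative.

```q
wmed:{x i sum .[>]0.5 1*(sum;sums)@\:y i:iasc x}        / Apply with an operator
wmedVar:{yi:y i:iasc x; x i sum (0.5*sum yi)>sums yi}   / sets a variable yi
wmedLam:{x i sum {sums[x]<0.5*sum x} y i:iasc x}        / inner lambda

/ hypothetical data
vals:30 10 20
wts:1 1 5f
wmed[vals;wts],wmedVar[vals;wts],wmedLam[vals;wts]      / 20 20 20
```

All three compute the same count of cumulative sums below half the total; they differ only in how the sorted weights are passed to both sum and sums.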