
KDB: weighted median


How can one compute a weighted median in KDB?

I can see that there is a function med for a simple median, but I could not find anything like wmed, analogous to wavg.

Thank you very much for your help!


Solution

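  • For values v and weights w, med v where w works only for integer weights (hence the scaling by 1000 below), and it gobbles space for larger values of w, because where repeats each index w times.

    Instead, sort w into ascending order of v and find where the cumulative sums of w reach half the total weight.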

    q)show v:10?100
    17 23 12 66 36 37 44 28 20 30
    q)show w:.001*10?1000
    0.418 0.126 0.077 0.829 0.503 0.12 0.71 0.506 0.804 0.012
    q)med v where "j"$w*1000
    36f
    
    q)w iasc v / sort w into ascending order of v
    0.077 0.418 0.804 0.126 0.506 0.012 0.503 0.12 0.71 0.829
    q)0.5 1*(sum;sums)@\:w iasc v / half the sum and cumulative sums of w
    2.0525
    0.077 0.495 1.299 1.425 1.931 1.943 2.446 2.566 3.276 4.105
    q).[>]0.5 1*(sum;sums)@\:w iasc v / compared
    1111110000b
    q)v i sum .[>]0.5 1*(sum;sums)@\:w i:iasc v / weighted median
    36
    
    q)\ts:1000 med v where "j"$w*1000
    18 132192
    q)\ts:1000 v i sum .[>]0.5 1*(sum;sums)@\:w i:iasc v
    2 2576
    
    q)wmed:{x i sum .[>]0.5 1*(sum;sums)@\:y i:iasc x}
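
    As a quick check of wmed, here is a minimal sketch written as script lines rather than a console session. The sample vectors are the ones printed in the session above, typed in literally; the equal-weight cases are added only for illustration. It suggests that wmed returns the first value at which the cumulative weight reaches half the total, so with an even count and equal weights it gives the lower middle value rather than the mean that med returns.

    ```q
    / wmed as defined in the answer: x are the values, y are the weights
    wmed:{x i sum .[>]0.5 1*(sum;sums)@\:y i:iasc x}

    / the sample data from the session above, typed in literally
    v:17 23 12 66 36 37 44 28 20 30
    w:0.418 0.126 0.077 0.829 0.503 0.12 0.71 0.506 0.804 0.012
    show wmed[v;w]            / 36, matching the session

    / equal weights, odd count: same value as med
    show wmed[1 2 3 4 5;5#1]  / 3
    / equal weights, even count: lower middle value, no interpolation
    show wmed[1 2 3 4;4#1]    / 2, whereas med 1 2 3 4 is 2.5
    ```

    If an interpolated result is wanted for ties, wmed as written would need adapting; that is outside what the answer covers.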
    

    Some vector techniques worth noticing:

    • Applying two functions to the same argument with Each Left, (sum;sums)@\:, then using Apply . with an operator on the result. This avoids setting a variable, e.g. (0.5*sum yi)>sums yi:y i, or defining an inner lambda, e.g. {sums[x]<0.5*sum x}y i
    • Grading one list with iasc to sort another
    • Multiple mappings through juxtaposition: v i sum ..
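
    The techniques listed above can be tried in isolation on small inputs. The following is a sketch only: the toy lists are made up for illustration, and wmed2 is a hypothetical name for the variable-setting alternative mentioned in the first bullet, not something from the original answer.

    ```q
    / Each Left: apply each function in the list to the same argument
    show (sum;sums)@\:1 2 3 4       / (10;1 3 6 10)

    / Apply with an operator: .[>] treats a two-item list as (left;right)
    show .[>](3;1 2 3 4)            / 1100b

    / grading one list with iasc to reorder another
    show `a`b`c`d iasc 30 10 40 20  / `b`d`a`c

    / hypothetical wmed2: same logic, but setting a variable yi
    wmed2:{yi:y i:iasc x; x i sum(0.5*sum yi)>sums yi}
    show wmed2[1 2 3 4 5;5#1]       / 3
    ```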