Search code examples
sqlsasstatisticsmedianproc-sql

How to calculated median with weights using SAS proc sql?


To calculate the mean of variable "value" with weight, we can simply do the following:

proc sql;
select sum(Value*Weight)/sum(Weight) as WeightedAverage
from table ;
quit; 

How can we calculate median of the variable "value" with weight? I know we can use proc means, but I would prefer a solution in proc sql.

Example:

data table;
input value weight;
datalines;
1 1
2 1
3 2
;
run;

Using proc means, we can get the mean (2.25) and median (2.5) easily:

proc means data=table mean median; 
var value;
weight weight;
run;

Using above proc sql, I can get the weighted average: 2.25.

How to get the median (which is 2.5)?


Solution

  • There's also a median() function in proc sql (available in SAS 9.4).

    A normal median would be like this:

    proc sql; select median(Value) as Median from table ; quit;
    

    A weighted Median would be something like this depending on how you want to use the weights:

    proc sql; select median(Value*Weight) as WeightedMedian from table ; quit;
    

    Example: The Median will be 2.5

    data have;
    input value;
    datalines;
    1 
    2 
    3 
    3
    ;
    run;
    proc sql;
    create table want as 
    select Median(value) as med
    from have;
    quit;
    

    Example: Segregate the data, so we have 4 row 1,2,3,3 instead of 3

    data have;
    input value weight;
    datalines;
    1 1
    2 1
    3 2
    ;
    run;
    
    data have_seg;
    set have;
    retain freq;
    freq= weight;
    if(freq=1) then do; output; end;
    else if freq > 1 then do; 
    do i=1 to freq; weight=1; output; end;
    end;
    keep value weight;
    run;