Search code examples
sasdatastep

How to create binary column that compares a value to the total average of the same column such that if nb > average=1 & nb < average=0


I'm working on an assignment & not allowed to use PROC SQL & must use DATA STEP or other PROCs. I have a dataset (Table A), & I'm trying to calculate a few things at the same time and I'm not any closer to my answer after spending few hours on r/sas, stackoverflow & YouTube. I need to create a binary column in Table A that =1 if the value in Column 3 >= the average of Column 3, but grouped by country. In PROC SQL it's quite simple:

PROC SQL;
CREATE TABLE Earl.A AS

SELECT * ,
CASE WHEN A.new_cases > AVG(A.new_cases) 
THEN 1 ELSE 0
END AS Global_Average_Comparison
FROM Earl.A AS A
QUIT;

But in SAS data steps....I feel a bit like a deer in the headlights. I tried using a PROC MEANS, but that creates a new table & any attempt to merge it back to Table A brings back columns I dropped in a previous query that are irrelevant:

PROC MEANS DATA=EARL.A;
    BY country;
    VAR new_cases;
output out=means(drop=_type_ _freq_) mean=mean;
RUN;

I'm realllllllllly new to SAS data steps but well versed in SQL :( so any guidance would be much appreciated.

Thank you!


Solution

  • You didn't show the attempts to merge, but the process is pretty simple.

    Example:

    data have;
      call streaminit(123);
      do country = 'A', 'B', 'C';
      do _n_ = 1 to 100 + rand('integer',50);
        x = rand('integer', 25, 75);
        output;
      end;
      end;
    run;
    
    proc means noprint data=have;
      by country;
      var x;
      output out=means(keep=country x_mean) mean=x_mean;
    run;
    
    data want;
      merge have means;
      by country;
    
      above_average_flag = x > x_mean;
    
      * drop x_mean;
    run;
    

    enter image description here