SAS - How to select random samples based on condition


I have a SAS data set that contains a column of numbers ranging from -2000 to 4000. I want to select 37 random samples based on the following conditions: if num is between -2000 and -1000, randomly select 10 samples from this range; if num is between -1000 and 0, randomly select 15 samples from this range; if num is between 0 and 1000, randomly select 12 samples from this range.

I've tried the following:

proc surveyselect data=save.table
   method=srs n=37 out=save.table_sample seed=1953;
run;

But this gives me 37 random samples from the whole population. I want to sample randomly within each data range instead.

Please help with SAS code, thanks so much in advance!


Solution

    1. Create a grouping variable in your data set that records which range each observation falls in.

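      data output;
      set save.table;
      /* number is the column of values; rename it to match your data */
      if number < -1000 then group=1;        /* -2000 to -1000 */
      else if number < 0 then group=2;       /* -1000 to 0 */
      else if number < 1000 then group=3;    /* 0 to 1000 */
      /* values of 1000 or more are left with a missing group */
      run;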
      
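    2. Use PROC SURVEYSELECT with a STRATA statement on GROUP. Supply the per-stratum sample sizes either through a data set that contains the same GROUP variable plus the sample size, or as a list directly in the PROC SURVEYSELECT statement. The input must be sorted by GROUP, and the list is matched to the strata in sorted order.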

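      proc sort data=output; by group; run;

      /* exclude rows outside the three ranges; sizes follow group order 1, 2, 3 */
      proc surveyselect data=output (where=(not missing(group)))
      method=srs out=save.table_sample seed=1953 sampsize=(10 15 12);
      strata group;
      run;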
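
    Step 2 also mentions passing the sample sizes through a data set rather than an inline list. A minimal sketch of that variant follows. The data set names SIZES and OUTPUT_SORTED are made up for illustration, and the size variable is assumed to be named _NSIZE_, which is what the PROC SURVEYSELECT documentation describes for a SAMPSIZE= data set (worth confirming for your SAS version). Untested, like the code above.

      /* one row per stratum: the GROUP value and how many rows to draw */
      data sizes;
      input group _nsize_;
      datalines;
      1 10
      2 15
      3 12
      ;
      run;

      /* keep only rows that fell into one of the three ranges, sorted by stratum */
      proc sort data=output out=output_sorted;
      where not missing(group);
      by group;
      run;

      proc surveyselect data=output_sorted method=srs sampsize=sizes
      out=save.table_sample seed=1953;
      strata group;
      run;

    Keeping the sizes in a data set can be easier to maintain than a positional list, since each size is tied to its GROUP value explicitly.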
      

    I couldn't test this because no sample data was provided, so here's an example using SASHELP.HEART:

    proc sort data=sashelp.heart out=heart; by chol_status; run;
    
    
    proc surveyselect data=heart (where=(not missing(chol_status))) method=srs sampsize=(5 10 15) out=want;
    strata chol_status;
    run;
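
    One way to check the result is to count the sampled rows per stratum. This sketch assumes the WANT data set produced by the example above; each stratum's frequency should equal the corresponding SAMPSIZE value, in the sorted order of chol_status.

      /* frequency of sampled rows in each stratum */
      proc freq data=want;
      tables chol_status;
      run;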