
SAS - Survey Select - Selecting Different Sample Size per Stratum


I have a list of financial advisors and need to pull 4 samples per advisor. The catch is that those 4 samples must include, say, 2 mortgages, 1 loan, and 1 credit card.

Is there a way in PROC SURVEYSELECT to set a specific number of samples to pull per stratum? I know you can stratify on one category and request an equal number from each stratum. I was hoping to use a mapping of employee names plus the number of samples left to pull for each category, and have SURVEYSELECT use that mapping to pull samples dynamically.

I'm using this as an example, but it only stratifies on employee and gives me 4 per employee. I would need to further stratify on product type and set a specific sample size per product.

proc surveyselect data=work.Emp_Table_Final
   method=srs n=4 out=work.testsample SELECTALL;
   strata Employee_No;
run;

Thanks. I know it might sound complicated, but if I know it's possible then I can google the rest.


Solution

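  • Yes, you can pass a dataset to the n= (sampsize=) option instead of a number. That dataset must:

    • Contain the strata variables as well as a variable named _NSIZE_ or SampleSize holding the number to select from each stratum
    • Use the same type and length for the strata variables as the input dataset does
    • Be sorted by the strata variables, in the same order as the input dataset
    • Have an entry for every stratum (every combination of strata variable values) in the input dataset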

    See the documentation for more details.

    data sample_counts;
    length sex $1;
    input sex $ _NSIZE_;
    datalines;
    F 5
    M 3
    ;;;;
    run;
    
    proc sort data=sashelp.class out=class;
    by sex;
    run;
    
    proc surveyselect n=sample_counts method=srs out=samples data=class;
    strata sex;
    run;
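
    A stratum that exists in the data but has no row in sample_counts is an easy mistake to make. As a rough sketch, reusing the class and sample_counts datasets from the example above, a query like the following lists any value of sex that is missing from sample_counts. If it returns no rows, every stratum is covered.

    ```sas
    proc sql;
      /* Strata present in the input data but absent from the counts dataset */
      select distinct c.sex
        from class as c
        where not exists
          (select * from sample_counts as s where s.sex = c.sex);
    quit;
    ```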
    

    For two strata variables it works the same way; sample_counts just needs both variables, with one row per combination. Listing every combination by hand gets tedious quickly, so you may want to build the dataset programmatically.

    proc sort data=sashelp.class out=class;
    by sex age;
    run;
    
    data sample_counts;
    length sex $1;
    input sex $ age _NSIZE_;
    datalines;
    F 11 1
    F 12 1
    F 13 1
    F 14 1
    F 15 1
    M 11 1
    M 12 1
    M 13 1
    M 14 1
    M 15 1
    M 16 0
    ;;;;
    run;
    
    /* or do it in an automated way*/
    
    data sample_counts;
      set class;
      by sex age;            *your strata;
      if first.age then do;  *do this once per stratum level;
        if age le 15 then _NSIZE_ = 1;  *whatever your logic is for defining _NSIZE_;
        else _NSIZE_=0;
        output;
      end;
    run;
    
    
    
    proc surveyselect n=sample_counts method=srs out=samples data=class;
    strata sex age;
    run;
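
    Applied to the question's scenario, the same pattern might look like the sketch below. It assumes work.Emp_Table_Final has a character variable holding the product type; the name Product_Type, its values ('Mortgage', 'Loan', 'Credit Card'), and the intermediate dataset work.emp_sorted are hypothetical and need to be adjusted to the real data. Rows with other product types are filtered out first so that every remaining stratum gets a sample size.

    ```sas
    /* Keep only the product types of interest and sort by the strata.
       Product_Type and its values are assumed names, not from the question. */
    proc sort data=work.Emp_Table_Final out=work.emp_sorted;
      where Product_Type in ('Mortgage', 'Loan', 'Credit Card');
      by Employee_No Product_Type;
    run;

    /* One row per Employee_No x Product_Type stratum, with its sample size */
    data work.sample_counts;
      set work.emp_sorted (keep=Employee_No Product_Type);
      by Employee_No Product_Type;
      if first.Product_Type;
      select (Product_Type);
        when ('Mortgage')    _NSIZE_ = 2;
        when ('Loan')        _NSIZE_ = 1;
        when ('Credit Card') _NSIZE_ = 1;
      end;
    run;

    /* SELECTALL (as in the question) takes every unit in a stratum
       that has fewer units than the requested sample size */
    proc surveyselect data=work.emp_sorted n=work.sample_counts
        method=srs out=work.testsample selectall;
      strata Employee_No Product_Type;
    run;
    ```

    An advisor who has no rows at all for one of the product types simply has no stratum for it, so that advisor ends up with fewer than 4 samples in total.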