arrays sas nested-loops sas-macro do-loops

Generating a Population Data Set in SAS


I am very new to SAS and I am trying to generate a population data set of categorical variables. I need a data set with 400 observations and 99 variables. The first column (Variable 1) will have four 1s and 396 0s, the second column (Variable 2) will have eight 1s and 392 0s, and so on, until the last column (Variable 99) has 396 1s and four 0s. In other words, Variable k has 4k 1s and 400 - 4k 0s. I have been trying to generate this data set but have had no luck so far. I believe I have to make use of macros, DO loops, arrays, and maybe even nested loops.

So far this is what I have, but I am pretty sure I am far from the actual solution:

DATA population;
    ARRAY pop V1-V99;
        DO N=1 TO 400;
           DO i=1 TO dim(pop);
               pop(i)=.....;
           END;
        DROP i;
        DROP N;
        END;
RUN;

Solution

  • Not really sure how this will help, but this seems to get you there:

    First create the row/column values in a long list, then flip to a wide structure as desired. This approach is dynamic and easily modified for any number of rows or columns, or for a different count of 1s. The 1s are simply assigned in order; you didn't specify whether they needed to be random or sequential.

    data have;
        *loop over 99 columns;
        do col=1 to 99;
            *create row values, using 4 rule and basic math for loop counting;
            do row=1 to 400;
                if row <= col*4 then val=1;
                else val=0;
                output;
            end;
        end;
    run;
    
    *sort for transpose;
    proc sort data=have;
        by row col;
    run;
    
    *flip to desired structure;
    proc transpose data=have out=want prefix=COL;
        by row;
        var VAL;
        id col;
    run;
    
    *check the number of 1s per column (no unmatched quote inside a statement-style comment);
    proc means data=want N SUM;
        var COL1-COL99;
    run;
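
    For comparison, the array-and-nested-loop approach the question started with can also produce the wide layout directly, without the sort and transpose. This is only a minimal sketch, not part of the original answer: it assumes the 1s may simply occupy the first rows of each column (sequential, not random), and it reuses the question's names (population, pop, V1-V99).

    ```sas
    /* Sketch only: assumes sequential placement of the 1s in each column. */
    data population;
        array pop{99} V1-V99;
        do n = 1 to 400;                 /* one pass per observation */
            do i = 1 to dim(pop);        /* one pass per variable */
                /* column i holds 4*i ones; the comparison evaluates to 1 or 0 */
                pop{i} = (n <= 4*i);
            end;
            output;                      /* write the completed row */
        end;
        drop i n;
    run;
    ```

    The explicit OUTPUT inside the outer loop is what the original attempt was missing: without it, the DATA step writes a single observation at the end. The same PROC MEANS check can be run against this data set with var V1-V99.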