I am very new to SAS and I am trying to generate a population data set of categorical variables. I need a data set with 400 observations and 99 variables. The first column (Variable 1) should have 4 1's and 396 0's, the second column (Variable 2) should have 8 1's and 392 0's, and so on, until the last column (Variable 99), which should have 396 1's and 4 0's. I have been trying to generate this data set but have had no luck so far. I believe I have to make use of macros, DO loops, arrays, and maybe even nested loops.
So far this is what I have, but I am pretty sure I am far from the actual solution:
DATA population;
ARRAY pop V1-V99;
DO N=1 TO 400;
DO i=1 TO dim(pop);
pop(i)=.....;
END;
DROP i;
DROP N;
END;
RUN;
Not really sure how this will help, but this seems to get you there:
First create the row/column values in a long list, then flip to a wide structure as desired. This is dynamic and easily modified for any number of rows/columns or selection of 1s. The 1s are simply assigned in order (column k gets 1s in its first 4*k rows), since you didn't specify whether they needed to be random or sequential.
data have;
  *loop over 99 columns;
  do col=1 to 99;
    *create row values, using 4 rule and basic math for loop counting;
    do row=1 to 400;
      if row <= col*4 then val=1;
      else val=0;
      output;
    end;
  end;
run;
*sort for transpose;
proc sort data=have;
by row col;
run;
*flip to desired structure;
proc transpose data=have out=want prefix=COL;
by row;
var VAL;
id col;
run;
*check # of 1's per col;
proc means data=want N SUM;
var COL1-COL99;
run;
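For comparison, since the question started from an array, here is a minimal sketch of the same sequential layout built in a single DATA step, with no sort or transpose. It assumes the 1s can sit in the first rows of each column, as above. The data set name want_array and the variable names row and i are only illustrative. The key differences from the attempt in the question are that the row loop is the outer loop and that an explicit OUTPUT writes one observation per row.

```sas
data want_array;
  *one array element per column, V1-V99 as in the question;
  array pop{99} V1-V99;
  do row=1 to 400;
    do i=1 to dim(pop);
      *the comparison evaluates to 1 when true and 0 when false;
      *so column i gets 1s in its first 4*i rows;
      pop{i} = (row <= 4*i);
    end;
    *write one observation per row;
    output;
  end;
  drop i;
run;
```

Running the same PROC MEANS check with var V1-V99 on want_array should show sums of 4, 8, and so on up to 396.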