I'm new to SAS, and I'm having trouble with Do Loops. Below is a snippet of the code.
DATA _NULL_;
set testing;
do i=_N_ to 4;
call symputx(NAME,"E&_N_");
end;
%PUT =&E&_N_;
run;
I'm expecting it to output something like the following:
E1 = A
E2 = B
E3 = C
E4 = D
However, I can't seem to get it to work. Any ideas what I'm doing wrong? Thanks for your help.
Focusing first on the issue of looping, it's important to know that the DATA step itself is a loop.
Given sample data:
data have;
input name $1;
cards;
A
B
C
D
;
run;
If you code:
data _null_;
set have;
run;
you have coded a loop, and the DATA step loop will iterate five times (not four). On the first iteration the SET statement reads the first record, on the second iteration it reads the second record, ... on the fifth iteration the SET statement tries to read a fifth record, but it hits the end of the file and the DATA step stops. The easiest way to see this iteration is to add PUT statements:
data _null_;
put "top of loop " _n_= ;
set have;
put "bottom of loop " _n_= name= /;
run;
top of loop _N_=1
bottom of loop _N_=1 name=A
top of loop _N_=2
bottom of loop _N_=2 name=B
top of loop _N_=3
bottom of loop _N_=3 name=C
top of loop _N_=4
bottom of loop _N_=4 name=D
top of loop _N_=5
NOTE: There were 4 observations read from the data set WORK.HAVE.
There are times it is useful to use an explicit DO loop to read data, rather than rely upon the implicit do loop. It's not clear this would be useful in this case, but you could do it like:
data _null_;
do _n_=1 to 4;
put "top of loop " _n_= ;
set have;
put "bottom of loop " _n_= name= /;
end;
run;
top of loop _N_=1
bottom of loop _N_=1 name=A
top of loop _N_=2
bottom of loop _N_=2 name=B
top of loop _N_=3
bottom of loop _N_=3 name=C
top of loop _N_=4
bottom of loop _N_=4 name=D
top of loop _N_=1
NOTE: There were 4 observations read from the data set WORK.HAVE.
In that case, there is still an implicit DATA step loop. But on the first iteration of the implicit DATA step loop, you execute the SET statement four times inside the explicit DO loop, reading all four records. On the second iteration of the DATA step loop, the SET statement tries to read a fifth record and hits the end of file so the DATA step completes.
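If you do not want to hard-code the number of records, one possible variation is to let the SET statement report when it has read the last record. This is only a sketch, assuming the HAVE data set above; the flag name eof is an arbitrary choice, not something from the original code:

```sas
data _null_;
  /* eof is an arbitrary flag name; END= sets it to 1 when SET reads the last record */
  do _n_ = 1 by 1 until (eof);
    set have end=eof;
    put "inside loop " _n_= name=;
  end;
run;
```

As before, the implicit DATA step loop is still there: on its second iteration the SET statement tries to read past the end of the file and the step stops.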
Understanding the implicit looping of the DATA step is essential for beginning SAS programmers. Most people recommend that beginning programmers should AVOID learning the macro language, because it is a different language than the DATA step language. Since the macro language is (typically) used to generate SAS language code, you need to have a solid understanding of the SAS language before you learn the macro language.
That said, if your goal is to create four macro variables named E1 E2 E3 E4 which will resolve to the values A B C D respectively, you can do it with CALL SYMPUTX. The first argument to CALL SYMPUTX is the name of the macro variable to be created, and the second argument is the value to be assigned to the macro variable. So you could do it like:
data _null_;
set have;
call symputx(cats("E",_N_),Name);
run;
%put E1=&E1 E2=&E2 E3=&E3 E4=&E4;
That uses the CATS() function to compute the name of the macro variable to be generated (the letter "E" concatenated with the value of _N_), and assigns the value of the DATA step variable NAME to the macro variable. There are other ways to create such lists of macro variables (aka macro variable arrays). Importantly, the %PUT statement is after the RUN statement. This is because the macro language statement %PUT is not part of the DATA step language: the macro processor executes it as soon as it is encountered, so a %PUT placed before RUN would execute before the DATA step has run and created the macro variables.
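As one example of those other ways, PROC SQL can store the rows of a query into a range of macro variables with the INTO clause. This is a minimal sketch, assuming the same HAVE data set with exactly four rows; without an ORDER BY clause the row order is not formally guaranteed, although it usually follows the order of the data set:

```sas
proc sql noprint;
  /* store NAME from each row into E1, E2, E3, E4 in turn */
  select name into :E1-:E4
  from have;
quit;

%put E1=&E1 E2=&E2 E3=&E3 E4=&E4;
```

Here too, the %PUT comes after the step has finished (after QUIT), so the macro variables exist by the time it executes.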