Since the the length of the value of a macro variable cannot exceed the maximum length of (65534), I can't create a single macro variable for all of my observations. I would like to create a macro to iterate through my data set to generate several numeric lists that I can pass to a where statement in proc sql.
Instead of this:
*proc sql noprint;
select ID into :ParaList separated by ','
from work.ID_LIST(**firstobs=1 obs=5000**);
quit;*
*proc sql noprint;
select ID into :ParaList2 separated by ','
from work.ID_LIST(**firstobs=5001 obs=10000**);
quit;*
*proc sql noprint;
select ID into :ParaList3 separated by ','
from work.ID_LIST(**firstobs=10001 obs=15000**);
quit;*
*proc sql noprint;
select ID into :ParaList4 separated by ','
from work.ID_LIST(**firstobs=15001 obs=20000**);
quit;*
I'd like something like:
*proc sql noprint;
select ID into :List1-Last4 separated by ','
from work.ID_LIST(**firstobs=1 obs=&LASTOBS** BY 5000);
quit;*
I'd like to create a macro to loop through every 5000 observations or so until last observation that I can pass into a where statement such as where id in (&ParaList,&ParaList2,&ParaList3,&ParaList4). I know there are alternatives such as
id in (select id from work.table)
but in this case, it doesn't work. I am querying Hadoop through SAS and haven't had any success except passing macro variable lists.
You could easily use a data step to generate the macro variables. You should also generate a macro variable that calls all of the other macro variables.
%let n_per_list=5 ;
data _null_;
length idlist $32000;
length macrolist $1000 ;
retain macrolist;
do i=1 to &n_per_list until (eof);
set id_list end=eof;
idlist=catx(',',idlist,id);
end;
listno+1;
call symputx(cats('paralist',listno),idlist);
macrolist=catx(',',macrolist,cats('&','paralist',listno));
call symputx('paralist',macrolist);
run;
Which for a simple test of 20 values split into groups of 5 yields this result:
151 %put Paralist=%superq(ParaList);
Paralist=¶list1,¶list2,¶list3,¶list4
152 %put &=Paralist1;
PARALIST1=1,2,3,4,5
153 %put &=Paralist;
PARALIST=1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20
Or you could just look at generating the code into a macro instead of using macro variables. That should not have any limits on how long a list you could generate. You could try to figure out how to open the dataset while inside a macro without generating any SAS code so that the result of the macro call is just the list of values. But it would be much easier to generate the source code for the macro definition to a file and then %include the file to define it.
filename code temp;
data _null_;
set id_list end=eof;
file code lrecl=80;
if _n_=1 then put '%macro paralist;' ;
else put ',' @ ;
put id @ ;
if eof then put / '%mend paralist;' ;
run;
Result for same trivial 20 value list.
163 %include code / source2;
NOTE: %INCLUDE (level 1) file CODE is file /.../#LN00043.
164 +%macro paralist;
165 +1 ,2 ,3 ,4 ,5 ,6 ,7 ,8 ,9 ,10 ,11 ,12 ,13 ,14 ,15 ,16 ,17 ,18 ,19 ,20
166 +%mend paralist;
NOTE: %INCLUDE (level 1) ending.
167 %put %paralist;
1 ,2 ,3 ,4 ,5 ,6 ,7 ,8 ,9 ,10 ,11 ,12 ,13 ,14 ,15 ,16 ,17 ,18 ,19 ,20