Search code examples
sasmacrosdo-loops

Creating macro lists of variable names based on different criteria


I have a very wide dataset of population denominators by age and race that (simplified for this question) looks like this:

enter image description here

the variable name carries the information of who the population number describes by race & age for each zip code. so w_1 are white 1 year olds, b_1 are black one year olds, a for asian and this goes from ages 1-100.

my task is to sum the population at each zip code by various different cuts. for example ages 5+, 5-11, 12-17, 65+, by each race group, by race*age group. Naturally, I don't want to type out w_1 + w_2 + ...+ w_100 for all the different categories.

My strategy is to pull the relevant variable names into a macro list for each category that I want a population sum. I've already pulled the variable names into a dataset called "varname" and used substr() to assign the variable name age & rage data so it can be easily filtered. Now I want to filter that varname dataset for each race type and create a macro list of all the variables. here's what my code looks like:

    %let racecat = white black aian asian nhppi latinx;
    %let race = 'W' 'B' 'I' 'A' 'P' 'H';

    %macro varlist_race;
    %let varnum= %sysfunc(countw(&racecat.));
    %do i=1 %to &varnum.;
        %let nextgroup= %scan(&racecat.,&i);

        proc sql;
            select strip(name)
            into :&nextgroup separated by ","
            from varname
            where race =%scan(&race.,&i) ;
        quit;
    %end; %mend varlist_race;
    %varlist_race;

what I expected was a macro list for each racecat like &white &black etc. When I run this code my results viewer is printing out the multiple variable name lists correctly but the macro lists don't seem to be saving. for example I'm getting this error afterward.

%put &white;

WARNING: Apparent symbolic reference WHITE not resolved.

the end usage will look like this:

    data pop2019_sums;
    set pop2019;
     pop_white = sum(white);
     pop_black = sum(&black);
     pop_asian= sum(&asian);
    run;

I have no clue why these macro lists aren't saving after the do loop. any feedback is greatly appreciated!


Solution

  • Macro variables have a concept known as scope. Local macro variables only exist within the macro and do not exist once the macro ends.

    You have to explicitly change it to global so it exists outside the macro by using the %GLOBAL statement.

    %let racecat = white black aian asian nhppi latinx;
    %let race = 'W' 'B' 'I' 'A' 'P' 'H';
    
    %macro varlist_race;
    %let varnum= %sysfunc(countw(&racecat.));
    %do i=1 %to &varnum.;
        %global %scan(&racecat.,&i);
        %let nextgroup= %scan(&racecat.,&i);
    
    
        proc sql;
            select strip(name)
            into :&nextgroup separated by ","
            from varname
            where race =%scan(&race.,&i) ;
        quit;
    %end; 
    %mend varlist_race;
    %varlist_race;
    

    That being said, this is very cumbersome and inefficient. If you transpose your data so that you have it structured as:

      ZipCode  Age  Race  Count
    

    Then you can quite easily summarize for various age groups easily, using proc summary/means including any overlapping ranges as well using a multilabel format, examples here.