Search code examples
sasdatasteppopulation

Add age group populations together SAS


I am trying to add the populations of a few age groups (40-44, 45-49, 50-55) together by state and year to create a greater than 40 population. I cannot figure out how to do this within sas. Any help would be appreciated!

Sample data:

data have;
input state$ year age_grp$ pop;
datalines;
Alabama 2010 40-44 12345
Alabama 2010 45-49 23456
Alabama 2010 50-54 13452
Alabama 2011 40-44 22342
Alabama 2011 45-49 11234
Alabama 2011 50-54 43213
Arizona 2010 40-44 11111
Arizona 2010 45-49 22222
Arizona 2010 50-54 11221
Arizona 2011 40-44 12312
Arizona 2011 45-49 21212
Arizona 2011 50-54 12312
;
run;

I am trying to do this within the datastep and create a gt40_pop variable that I can use while keeping the original data for subsequent analyses.

Thank you in advance!


Solution

  • The most efficient way to do this would be to create a format and apply it when needed. You can then sum up your variables with something like PROC MEANS or PROC SQL to create those sums without needing to create any extra variables. For example:

    proc format;
        value $fortyplus
            '0-4'   = '0-4'
            '5-9'   = '5-9'
            '10-14' = '10-14'
            '15-19' = '15-19'
            '20-24' = '20-24'
            '25-29' = '25-29'
            '30-34' = '30-34'
            '35-39' = '35-39'
            other   = '40+'
        ;
    run;
    

    Now you can apply it in PROCS or in SQL:

    proc means data=have noprint;
        format age_grp $fortyplus.;
        class state year age_grp;
        ways 3;
        output out=want 
            sum(pop) = total_pop
        ;
    run;
    

    Or:

    proc sql;
        create table want as
            select state
                 , year
                 , put(age_grp, $fortyplus.) as age_grp 
                 , sum(pop) as total_pop
            from have
            group by state, year, calculated age_grp
        ;
    quit;
    
    state   year    age_grp total_pop
    Alabama 2010    40+     49253
    Alabama 2011    40+     76789
    Arizona 2010    40+     44554
    Arizona 2011    40+     45836