Search code examples
arrayssassas-macro

Summing variable by naming them with arrays as prefix


In the following code, I have several variables linked to a day of the week. I wish to regroup by sum all the values concerned by the same day (eg. sales_monday1 and sales_monday2 sumed in a new variable named Monday). For this purpose, I thought to use an array :

    data test;
input sales_monday1 sales_monday2 sales_tuesday sales_wednesday;
datalines;
1 1 2 .
2 5 6 .
3 20 . 1
;
run;

Data test;
    ARRAY weekDays{*} Monday Tuesday Wednesday Thursday Friday Saturday Sunday;
    set test;

    do i=1 to DIM(weekDays);
        weekDays{i}= sum(of sales_ weekDays[i]:);
    end;
    drop i;
run;

My problem is : I cannot refer to my array in a sum function considering it is part of a concatenated variable name weekDays{i}= sum(of sales_ weekDays[i]:);. Does a solution to that matter exist?


Solution

  • Arrays don't quite work like that. You're using them as if they're macro variables, when they're not.

    You can't, thus, use the : suffix on the macro variable. You can do this instead one of a few ways.

    First, you could simply put all of the sales variables into an array, and loop over both, in whichever order you like, and compare variable names using VNAME and some help.

    Data test_sum;
        ARRAY weekDays{*} Monday Tuesday Wednesday Thursday Friday Saturday Sunday;
        set test;
        array sales sales_:;
    
        do i=1 to DIM(weekDays);
           do j = 1 to dim(sales);
             if upcase(compress(scan(vname(sales[j]),2,'_'),,'ka')) = upcase(vname(weekdays[i]))
                then weekdays[i] = sum(weekdays[i],sales[j]);
           end;
        end;
        drop i j;
    run;
    

    It's quite inefficient as you do a ton of unnecessary comparisons, so perhaps this isn't for you if you have a large dataset. It's probably the right answer for a small dataset.

    For a large dataset, you should do this with the macro language instead.

    %macro sum_weekday(name=);
      %let weekday = %sysfunc(compress(%sysfunc(scan(&name.,2,_)),,ka));
      &weekday. = sum(&weekday., &name.);
    %mend sum_weekday;
    
    proc sql;
      select cats('%sum_weekday(name=',name,')') into :sumlist separated by ' '
        from dictionary.columns
        where memname='TEST' and libname='WORK'
      ;
    quit;
    
    data test_macro;
       set test;
       &sumlist;
    run;
    

    This is pretty efficient as it only looks at the list of variables once, not once per row. Basically, it just creates however many statements like

    monday = sum(monday, sales_monday1);
    monday = sum(monday, sales_monday2);
    tuesday= sum(tuesday,sales_tuesday);
    

    Etc., based on dictionary.columns which is the list of variables in all tables in your SAS. (This may be slow if you're in a server environment with lots of libraries defined via the metadata server; you have other ways to do this, then.)