Search code examples
sasmacroslocal-variables

SAS Syntax error in macro and setting up local variables


I'm very new to SAS. I've imported data from two excel spreadsheets. From the first, I calculated new variables names Sorptivity and Kfs. This step works fine. I then merged the newly created variables, of which there were 24 with the second table, which had 237 observations. 21 of the observations in the first table corresponded to 21 unique Treatment x Block combinations in the second, so in effect the Sorptivity and Kfs values were added multiple times per Treatment x Block, corresponding to Time and CI entries (there's between 7 and 12 Time and CI entries per Treatment x Block). The 3 'missing entries' were outliers that had been removed from the second table.

I want to run a non-linear model (CI = Sorptivity * sqrt(Time) + Kfs). It must give me output for each Treatment x Block combination. I want to see as output the original values for Time, CI, Sorptivity and Kfs, as well as a predicted CI value corresponding to each time measurement.

With the help of ChatGPT I've set up the following code, but I'm running into an error, and because the erorr is within the macro, I'm getting an additional spooling error which means I can't pinpoint where the error is. I'm hoping someone can help me figure out where the issue is and how to correct it.

%macro run_model(treatment, block, time, ci, s, k);
data temp;/* Create local macro variables */
    merge infiltration  (where=(Treatment = "&treatment" and Block = "&block"))
        InfilNewData  (where=(Treatment = "&treatment" and Block = "&block"));
    by Treatment Block;
    Time = &time;
    CI = &ci;
    Sorptivity = &s;
    Kfs = &k;
run;
proc nlin data=temp;
    parms Sorptivity = &s Kfs = &k;
    model CI = Sorptivity * sqrt(Time) + Kfs;
    output out=PredictedCI_&treatment Block=Block Time=Time CI=CI Sorptivity=Sorptivity Kfs=Kfs predicted=PredictedCI;
run;
data PredictedCI_&treatment;/* Add a Treatment group identifier to the output */
    set PredictedCI_&treatment;
    Treatment = "&treatment";
    Block = "&block";
run;
%mend;

%macro run_all_models;
%local treatment_list block_list s_list k_list time_list ci_list;
/* Create macro variable lists within the macro */
proc sql noprint;
  select distinct Treatment into :treatment_list separated by ' ' from InfilDF;
  select distinct Block into :block_list separated by ' ' from InfilDF;
  select distinct Sorptivity into :s_list separated by ' ' from InfilNewData;
  select distinct Kfs into :k_list separated by ' ' from InfilNewData;
  select distinct Time into :time_list separated by ' ' from InfilDF;
  select distinct CI into :ci_list separated by ' ' from InfilDF;
quit;
/* Loop over the macro variables and run the models */
%do i = 1 %to %sysfunc(countw(&treatment_list));
    %let current_treatment = %scan(&treatment_list, &i);
    %let current_block = %scan(&block_list, &i);
    %let current_s = %scan(&s_list, &i);
    %let current_k = %scan(&k_list, &i);
    %let current_time = %scan(&time_list, &i);
    %let current_ci = %scan(&ci_list, &i);
    %run_model(&current_treatment, &current_block, &current_time, &current_ci, &current_s, &current_k);
    proc print data=PredictedCI_&current_treatment;
    run;
%end;
%mend;

%run_all_models;

In the run_model macro I initially set data temp to obtain local variables sourced only fom a combined DF called InfilDF, which had all 4 numeric variables as described above. I then tried changing it to the code you see here which pulls the variables from the two original tables. The result/error is the same:

      WHERE (Treatment='Biochar10tha') and (Block='Block1');  
22: LINE and COLUMN cannot be determined.
ERROR 22-322: Syntax error, expecting one of the following: ;, (, /, ESS, H, J, L95,
              L95M, LCL, LCLM, LMAX, OUT, P, PARMS, PRED, PREDICTED, PRES, PROJRES,
              PROJSTUDENT, PRSTUD, R, RESEXPEC, RESIDUAL, REXPEC, SSE, STDI, STDP, STDR,
              STUDENT, U95, U95M, UCL, UCLM, WEIGHT.
76: LINE and COLUMN cannot be determined.
ERROR 76-322: Syntax error, statement will be ignored.

The data and code is available at: https://github.com/AnelD13/SAS_infiltration. The two main places where I've found an error is both within the first macro in setting up the temp DF and in the model code itself. There may be an issue with how the Sorptivity and Kfs macro variables are being set up, as previous errors all pointed to this.


Solution

  • For anyone looking for a similar answer, I ended up simplifying it. The macro in my question created local variables which were meant to be entered into the nlm in an iterative manner with the intention of creating a 'by-loop'. I'm from an R background and had used a smiliar method there. Once the model looped through all the treatment and block variables, it was meant to output to a new data frame, using the variables from the initial input data frame along with the newly created variable. I sort-of got it to work on 3 spearate occasions without knowing how or why, but the predicted output was always way off, and the data frame output also had the wrong input variable values from the initial data frame.

    Based on the comments above, I tried using SAS's built-in nlin function with the by-statement to loop through Treatment and Block. I've used the by-statement in other procedures but didn't think to do it for this one before I asked the question. The following code is essentially the same procedure as in the macro, but without trying to loop it externally. This was then followed by a simple merging of the two data frames, instead of looping thorugh the merge as per the macro.

    proc nlin data=InfilDF;
        parms Sorptivity=0.01 Kfs=0.01;
        model CI = Sorptivity * sqrt(Time) + Kfs;
        by Treatment Block;
        output out=PredictedResults predicted=PredictedCI;
    run;
    data InfilPredicted;
        retain Block Treatment Time CI Infiltration Sorptivity Kfs; /* I had to do this as my Time 
                    variable showed up empty*/
        merge InfilDF PredictedResults;
        by Treatment Block;
    run;
    proc sort data=InfilPredicted;
    by Treatment Block Time;
    run;
    proc print data=InfilPredicted;
    run;
    

    This is a much simpler solution, in line with much of the simplicity seen in SAS' procedures. It prevents adding in unecessary code or variables.