Hopefully somebody can help me with my problem.
For the company I work for, I need to check if there is any data available in the dataset which is delivered monthly to me. If there is no data available then SAS has to stop the script and give an error (this is not the problem, the problem is before that part). The problem is as follows: We work with a very large dataset. New data is compared to old data so the new data and the 11 months before that are checked. My fellow epidemiologists and I write down the months we want to check in a predesigned macro-variable called 'deliveryfile'. This cannot be changed so this macrovariable needs to be included in the script. I want to make a loop that checks if a month (or months) is present in a dataset. If it is present, nothing will happen. If it is not present then the month is added to the pre-existing dataset.
Example dataset and macrovariable:
DATA existingdataset;
INPUT yearmonth total;
DATALINES;
202108 400
202109 0
202110 450
;
RUN;
%LET deliveryfile = 202110, 202111;
With the code above, I need to write a script which can check if month 202110 and 202111 are already present in the existingdataset. 202110 is present, so nothing will happen. 202111 is not present so it needs to be added to 'existingdataset'. Just the month needs to be added, the variable 'total' would be empty/NULL.
This is what I came up with, but I got several errors, most have to do with how I use PROC SQL. Number 1:
PROC SQL NOPRINT;
SELECT DISTINCT yearmonth INTO :yearmonth SEPARATED BY ', ' FROM existingdataset;
QUIT;
%PUT NOTE: yearmonth : &yearmonth; /*For verification */
%MACRO ADDING;
DATA existingdataset;
SET existingdataset;
IF
%DO i=1 %TO %SYSFUNC(COUNTW("&deliveryfile "));
%LET ymcontrol=%SCAN(%QUOTE(&deliveryfile ),&i,%STR( ));
&ymcontrol NOT IN (&yearmonth)
%END
THEN;
PROC SQL;
INSERT INTO existingdataset
SET yearmonth= &ymcontrol;
QUIT;
END;
RUN;
%MEND ADDING;
%ADDING;
Then I thought, maybe I can use PROC SQL, IF THEN and a DO loop in 1 go, but that doesn't work either.
PROC SQL NOPRINT;
SELECT DISTINCT yearmonth INTO :yearmonth SEPARATED BY ', ' FROM existingdataset;
QUIT;
%PUT NOTE: yearmonth : &yearmonth; /*For verification */
%MACRO ADDING;
PROC SQL;
IF
%DO i=1 %TO %SYSFUNC(COUNTW("&deliveryfile "));
%LET ymcontrol=%SCAN(%QUOTE(&deliveryfile ),&i,%STR( ));
&ymcontrol NOT IN (&yearmonth)
%END
THEN;
INSERT INTO existingdataset
SET yearmonth= &ymcontrol;
END;
QUIT;
%MEND ADDING;
%ADDING;
I hope somebody has an answer to my problem.
If somebody has a better idea to do everything in 1 go (check for the months, make a table which visualizes which month or months are incorrect, and stopping the script) I would be open for that too. I will explain now what the script needs to do entirely. Eventually, what the script needs to do, is to check (1) if the month (deliveryfile) is present in the dataset (existingdataset) and (2) is there a month present in the dataset which has a total number of 0 or nothing. If there is a month present with a total number of 0 or nothing then the script needs to (1) make a table with only the month or months with a total number of 0 and nothing into the log, (2) put a note in de log which states "NOTE: Syntax has stopped" and (3) quits the rest of the script but not aborts SAS entirely. If all is good then the script continues with the rest. I already tried to write something down, but it isn't finished yet.
DATA _NULL_;
%IF total = 0 %THEN %DO;
PROC SQL;
CREATE TABLE nototal_details AS
SELECT DISTINCT yearmonth, total
FROM existingdataset
WHERE total= 0
ORDER BY yearmonth DESCENDING;
QUIT;
/* Still need to write a part that puts that table into the LOG*/
%PUT NOTE: Syntax has stopped.;
%ABORT CANCEL;
%END;
RUN;
So, in the example I present above, eventually the script needs to give this table in the log before SAS is stopped: |Months without information|Total| |:-------------------------|----:| |202109 | 0 | |202111 | NULL|
Sounds like you just need to create a dataset with the list of values you want to force to exist and merge it with your actual data.
data required;
do yearmonth = 202110, 202111 ;
output;
end;
run;
data want;
merge existingdataset required;
by yearmonth;
run;
So if you have this macro variable:
%LET deliveryfile = 202110, 202111;
That is perfect to use to generate the DO statement needed.
data required;
do yearmonth = &deliveryfile ;
output;
end;
run;
So no macro code (other than the macro variable reference) or SQL needed.