
Checking my interpretation of SAS code


I am pretty new to SAS. Can you please help me interpret the following lines of code:

proc means data=crsp1 noprint;
var ret;
by gvkey datadate year;
output out=exec_roll_vol_fyear n=nrollingstd std=rollingstd;
run;

data volatility;
set exec_roll_vol_fyear;
where &start_year <= year <= &end_year;
* we have volatility of monthly returns,
converting to annual volatility;
estimated_volatility=rollingstd*(12**0.5);
proc sort nodupkey;
by gvkey year;
run;

Does this mean: take the dataset "crsp1" and create a dataset "exec_roll_vol_fyear" that contains the rolling standard deviation of "ret"? (I don't quite see what "proc means" does here.)

Second part: use "exec_roll_vol_fyear" to create a dataset "volatility", where estimated_volatility=rollingstd*(12**0.5), and then drop duplicates of gvkey and year. Am I right?


Solution

  • PROC MEANS is a summarization procedure. Here it computes the number of non-missing values (n) and the standard deviation (std) of ret for each unique combination of gvkey, datadate, and year, and writes them to the dataset exec_roll_vol_fyear as nrollingstd and rollingstd. Whether this is a "rolling" standard deviation depends on how the incoming data is structured: it is rolling only if datadate defines the rolling windows and each record is duplicated once for every window it falls in. That is impossible to tell from this code alone. There are better tools for time series analysis in SAS, though.

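    Then the data step keeps only the rows whose year lies between &start_year and &end_year and creates a new variable, estimated_volatility, by multiplying the monthly standard deviation by 12**0.5 (the square root of 12) to annualize it. This is the usual scaling when monthly returns are assumed to be uncorrelated. Finally, PROC SORT, which has no DATA= option and therefore works on the most recently created dataset (volatility), sorts it by gvkey and year, and NODUPKEY keeps only the first row for each gvkey/year combination.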
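
To see what the two steps produce, here is a minimal sketch on made-up data. The dataset names crsp_demo, demo_vol and demo_annual and all of the return values are hypothetical and are not taken from the question. Only the PROC MEANS call and the annualization formula mirror the original code.

```sas
/* Hypothetical toy data: two firms, a few monthly returns in one fiscal year */
data crsp_demo;
    input gvkey $ datadate :date9. year ret;
    format datadate date9.;
    datalines;
001 31DEC2010 2010 0.02
001 31DEC2010 2010 -0.01
001 31DEC2010 2010 0.03
002 31DEC2010 2010 0.05
002 31DEC2010 2010 0.01
;
run;

/* BY-group processing expects the input sorted by the BY variables */
proc sort data=crsp_demo;
    by gvkey datadate year;
run;

/* One output row per gvkey/datadate/year: count and standard deviation of ret */
proc means data=crsp_demo noprint;
    var ret;
    by gvkey datadate year;
    output out=demo_vol n=nrollingstd std=rollingstd;
run;

/* Annualize the monthly standard deviation by multiplying by sqrt(12) */
data demo_annual;
    set demo_vol;
    estimated_volatility = rollingstd * sqrt(12);
run;

proc print data=demo_annual;
run;
```

The printed dataset should have one row per firm, with nrollingstd holding the number of returns in the group and rollingstd their standard deviation. PROC MEANS also adds the automatic variables _TYPE_ and _FREQ_ to the output dataset. Nothing in this toy example is "rolling": each return belongs to exactly one BY group, which is why the structure of the real crsp1 data matters.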