SAS problem with curvelabelpos and xaxis in PROC SGPLOT


I am using PROC SGPLOT in SAS to create a series plot with five lines (8th Grade, 10th Grade, 12th Grade, College Students, and Young Adults). The y-axis is the prevalence of drug use as a percentage, from 0 to 100. The x-axis is the year, 1975 to 2019, formatted with PROC FORMAT so that the years display as '75 through '19. I would like to label each line with its group name (8th Grade through Young Adults). But when I use:

proc sgplot data=save.fig2_1data noautolegend;
series x=year y=eighth / lineattrs=(color=orange) curvelabel='8th Grade' curvelabelpos=start;
series x=year y=tenth / lineattrs=(color=green) curvelabel='10th Grade' curvelabelpos=start;
series x=year y=twelfth / lineattrs=(color=blue) curvelabel='12th Grade' curvelabelpos=start;
series x=year y=college / lineattrs=(color=red) curvelabel='College Students' curvelabelpos=start;
series x=year y=youngadult / lineattrs=(color=purple) curvelabel='Young Adults' curvelabelpos=start;
xaxis label="YEAR" values=(1975 to 2019 by 2) minor;
yaxis label="PERCENT" max=100 min=0;
format year yr.;
run;

(Image: series plot)

The curvelabelpos= option has no setting that places a label above the first data point of the "12th Grade" and "College Students" lines, so the labels sit to the left of the lines and the x-axis ends up with empty space on the left side of the plot. How do I move these two labels above the first data point of each line so that the x-axis has no empty space?


Solution

  • There are no series statement options that will produce the labeling you want.

    You will have to create an SG annotation data set and pass it to PROC SGPLOT with the sganno= option.

    In the sample code below, the curvelabel= option is set to '' for the 12th Grade and College Students series, so the procedure draws those lines using the widest amount of horizontal drawing space. The sganno data set contains the annotation functions that draw your own curve label text near the first data point of each series that has a blank curvelabel. Adjust the %sgtext anchor= value as needed. Be sure to read the SG Annotation Macro Dictionary documentation to understand all of the text annotation capabilities.

    If you also want an artificial split in the series lines (the sample code splits them at 2013), there are two things to try:

    • Introduce a fake year, 2012.5, for which none of the series variables has a value. I tried this, but only 1 of the 5 series drew with a 'fake' split.
    • Introduce N new variables for the N lines that need a split. For the time frame after the split, copy the data into the new variables and set the originals to missing.
      • Add SERIES statements for the new variables.
    data have;
      call streaminit(1234);
    
      do year = 1975 to 2019;
        array response eighth tenth twelfth college youngadult;
    
        if year >= 1991 then do;
          eighth = round (10 + rand('uniform',10), .1);
          tenth = eighth + round (5 + rand('uniform',5), .1);
          twelfth = tenth + round (5 + rand('uniform',5), .1);
    
          if year in (1998:2001) then tenth = .;
        end;
        else do;
          twelfth = 20 + round (10 + rand('uniform',25), .1);
        end;
    
        if year >= 1985 then do;
          youngadult = 25 + round (5 + rand('uniform',20), .1);
        end;
    
        if year >= 1980 then do;
          college = 35 + round (7 + rand('uniform',25), .1);
        end;
    
        if year >= 2013 then do _n_ = 1 to dim(response);
          %* simulate inflated response level;
          if response[_n_] then response[_n_] = 1.35 * response[_n_];
        end;
    
        output;
      end;
    run;
    
    data have_split;
      set have;
      array response  eighth  tenth  twelfth  college  youngadult;
      array response2 eighth2 tenth2 twelfth2 college2 youngadult2;
    
      if year >= 2013 then do _n_ = 1 to dim(response);
        response2[_n_] = response[_n_];
        response [_n_] = .;
      end;
    run;
    
    ods graphics on;
    ods html;
    
    %sganno;
    
    data sganno;
      /* these variables track the first ('start') point
         of each series being annotated */
      retain y12 ycl;
    
      set have;
      if missing(y12) and not missing(twelfth)  then do; 
        y12=twelfth;
        %sgtext(label="12th Grade", textcolor="blue", drawspace="datavalue", anchor="top", x1=year, y1=y12, width=100, widthunit='pixel')
      end;     
    
      if missing(ycl) and not missing(college) then do; 
        ycl=college; 
        %sgtext(label="College Students", textcolor="red", drawspace="datavalue", anchor="bottom", x1=year, y1=ycl, width=100, widthunit='pixel')
      end;
    run;
    
    
    proc sgplot data=have_split noautolegend sganno=sganno;
    series x=year y=eighth     / lineattrs=(color=orange) curvelabel='8th Grade'        curvelabelpos=start;*auto curvelabelloc=outside ;
    series x=year y=tenth      / lineattrs=(color=green)  curvelabel='10th Grade'       curvelabelpos=start;*auto curvelabelloc=outside ;
    series x=year y=twelfth    / lineattrs=(color=blue)   curvelabel='' curvelabelpos=start;*auto curvelabelloc=outside ;
    series x=year y=college    / lineattrs=(color=red)    curvelabel='' curvelabelpos=start;*auto curvelabelloc=outside ;
    series x=year y=youngadult / lineattrs=(color=purple) curvelabel='Young Adults'     curvelabelpos=start;*auto curvelabelloc=outside ;
    
    * series for the 'shifted' time period use the new variables;
    series x=year y=eighth2     / lineattrs=(color=orange) ;
    series x=year y=tenth2      / lineattrs=(color=green)  ;
    series x=year y=twelfth2    / lineattrs=(color=blue)   ;
    series x=year y=college2    / lineattrs=(color=red)    ;
    series x=year y=youngadult2 / lineattrs=(color=purple) ;
    
    xaxis label="YEAR" values=(1975 to 2019 by 2) minor;
    yaxis label="PERCENT" max=100 min=0 ;
    run ;
    
    ods html close;
    ods html;
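
As far as I understand, the %sgtext calls above just write ordinary annotation observations, so the same two labels can also be built without the macros. The sketch below assumes the have data set created by the sample code. The data set name sganno_plain and the flag variables seen12 and seencl are made up for illustration. The annotation variables (function, x1, y1, label, drawspace, anchor, textcolor, width, widthunit) are the standard SG annotation names.

```sas
/* Sketch: the two text labels written as plain annotation variables.  */
/* Assumes HAVE from the sample code; SGANNO_PLAIN is a made-up name.  */
data sganno_plain;
  length function $10 drawspace $12 label $40 anchor $12
         textcolor $12 widthunit $8;
  retain function 'text' drawspace 'datavalue' width 100 widthunit 'pixel';
  retain seen12 seencl 0;   /* hypothetical flags: first point labeled yet? */
  keep function drawspace label anchor textcolor width widthunit x1 y1;

  set have;

  /* first non-missing 12th grade value */
  if not seen12 and not missing(twelfth) then do;
    seen12 = 1;
    x1 = year;
    y1 = twelfth;
    label = '12th Grade';
    textcolor = 'blue';
    anchor = 'top';
    output;
  end;

  /* first non-missing college value */
  if not seencl and not missing(college) then do;
    seencl = 1;
    x1 = year;
    y1 = college;
    label = 'College Students';
    textcolor = 'red';
    anchor = 'bottom';
    output;
  end;
run;

proc sgplot data=have noautolegend sganno=sganno_plain;
  series x=year y=twelfth / lineattrs=(color=blue);
  series x=year y=college / lineattrs=(color=red);
  xaxis label="YEAR" values=(1975 to 2019 by 2) minor;
  yaxis label="PERCENT" max=100 min=0;
run;
```

My understanding is that anchor names the part of the text box that is pinned to (x1, y1), so anchor='bottom' should put the text above the point and anchor='top' should put it below. Check that against the annotation documentation before relying on it.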
    

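
For the fake-year idea, one more thing that may be worth trying is the BREAK option of the SERIES statement, which breaks the line at each missing Y value instead of connecting across it. This is only a sketch and assumes the have data set from the sample code. The data set name have_gap is made up, and I have not confirmed that this gives exactly the split you are after. Note that BREAK will also leave a gap in the 10th Grade line for 1998 to 2001, where the simulated data is missing.

```sas
/* Sketch: add one marker row at the fake year 2012.5 with no values, */
/* then let BREAK interrupt every line at that missing value.         */
/* Assumes HAVE from the sample code; HAVE_GAP is a made-up name.     */
data have_gap;
  set have end=last;
  output;
  if last then do;
    year = 2012.5;
    call missing(eighth, tenth, twelfth, college, youngadult);
    output;
  end;
run;

/* SERIES connects points in data order, so put the marker row in place */
proc sort data=have_gap;
  by year;
run;

proc sgplot data=have_gap noautolegend;
  series x=year y=eighth     / break lineattrs=(color=orange);
  series x=year y=tenth      / break lineattrs=(color=green);
  series x=year y=twelfth    / break lineattrs=(color=blue);
  series x=year y=college    / break lineattrs=(color=red);
  series x=year y=youngadult / break lineattrs=(color=purple);
  xaxis label="YEAR" values=(1975 to 2019 by 2) minor;
  yaxis label="PERCENT" max=100 min=0;
run;
```

Compared with the two-variable approach, this keeps one SERIES statement per group, but every missing value in the data produces a gap, not just the one at the marker row.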