Search code examples
sas

Subset records based on the occurrence of one observation but also additional ones


suppose to have the following:

  ID       Rep    Year     
 0001      exA    2015
 0001      exA    2015
 0002      exA    2015
 0002      exB    2015
 0002      exA    2015
 0002      exC    2015 
 0003      exC    2016
 0003      exA    2016
 0003      exD    2016
 0003      exA    2016 
.....     ...    ....

is there a way to subset IDs based on the appearance of exA in Rep column but also other ex*? This, only for 2015. ID 0001 that has only exA should not be extracted. The criteria is the presence of exA (mandatory) but also other ex*

Desired output:

  ID       Rep    Year           
 0002      exA    2015       
 0002      exB    2015
 0002      exA    2015
 0002      exC    2015 

Solution

  • Try this

    data have;
    input ID $ Rep $ Year;
    datalines;
    0001 exA 2015 
    0001 exA 2015 
    0002 exA 2015 
    0002 exB 2015 
    0002 exA 2015 
    0002 exC 2015 
    0003 exC 2016 
    0003 exA 2016 
    0003 exD 2016 
    0003 exA 2016 
    ;
    
    proc sql;
       create table want as
       select * from have
       where year = 2015
       group by ID
       having sum(Rep = 'exA') > 0
          and count(distinct Rep) > 1
       ;
    quit;