COUNTING VALUE PER PARTCIPANTS

I would like to add a new column to a dataset but I am not sure how to do so. My dataset has a variable called KEYVAR (character variable) with three different values. A participant can appear multiple times in my dataset, with each row containing a similar or different value for KEYVAR. What I want to do is create a new variable call NEWVAR that counts how many times a participant has a specific value for KEYVAR; when a participant does not have an observation for that specific value, I want NEWVAR to have a result of zero.

Here's an example of the dataset I would like (in this example, I want to count every instance of "Y" per participants as newvar):

have

PARTICIPANT KEYVAR  
A   Y   
A   N   
B   Y   
B   Y   
B   Y   
C   W   
C   N   
C   W   
D   Y   
D   N   
D   N   
D   Y   
D   W

want

PARTICIPANT KEYVAR  NEWVAR
A   Y   1
A   N   1
B   Y   3
B   Y   3
B   Y   3
C   W   0
C   N   0
C   W   0
D   Y   2
D   N   2
D   N   2
D   Y   2
D   W   2

Solution

You can use Proc SQL to compute an aggregate result over a group meeting a criteria, and have that aggregate value automatically merged into the result set.

-OR-

Use a MEANS, TRANSPOSE, MERGE approach

Sample Code (SQL)

data have;
input ID $ value $; datalines;
A   Y
A   N
B   Y
B   Y
B   Y
C   W
C   N
C   W
D   Y
D   N
D   N
D   Y
D   W
E   X   
;

proc sql;
  create table want as
  select ID, value
  , sum(value='Y') as Y_COUNT  /* relies on logic eval 'math' 0 false, 1 true */
  , sum(value='N') as N_COUNT
  , sum(value='W') as W_COUNT    
  from have
  group by ID
  ;

Sample Code (PROC and MERGE)

* format for PRELOADFMT and COMPLETETYPES;
proc format;
  value $eachvalue
    'Y' = 'Y'
    'N' = 'N'
    'W' = 'W'
    other = '-';
  ;
run;

* Count how many per combination ID/VALUE;
proc means noprint data=have nway completetypes;
  class ID ;
  class value / preloadfmt;
  format value $eachvalue.;
  output out=freqs(keep=id value _freq_);
run;


* TRANSPOSE reshapes to wide (across) data layout, one row per ID;
proc transpose data=freqs suffix=_count out=counts_across(drop=_name_);
  by id;
  id value;
  var _freq_;
  where put(value,$eachvalue.) ne '-';
run;

* MERGE;
data want_way_2;
  merge have counts_across;
  by id;
run;