Search code examples
sqlsasconditional-statementssubtractionproc

Conditional Subtraction within the same column using SAS


here is a portion of a data set I have named “antibody” :

Row          Subject        Type               Procedure           Measurement              Output
1               500         Intial              Invasive             20                        20
2               500          Initial            Surface              35                        35
3               500          Followup          Invasive             54                        54-20
4               428          Followup          Outer                 29                        29-10
5               765          Seventh           Other                 13                       13-19
6                500          Followup          Surface              98                       98-35
7                428          Initial             Outer            10                        10                
8                765          Initial              Other            19                        19     
9                610          Third              Invasive            66                       66-17
10               610          Initial             Invasive            17                       17

I was trying to use proc sql to perform this. The goal is to subtract the numbers in the "MEASUREMENT" column based on the "SUBJECT", "TYPE" and “PROCEDURE” columns. If two values in the “SUBJECT” column match and two values in the “PROCEDURE” column match, then the initial measurement should be subtracted from the other measurement. For example, the initial measurement in row 1 (20) should be subtracted from the followup measurement in row 3 (54) because the subject (500) and procedure (Invasive) match. Furthermore, the initial measurement in row 8 (19) should be subtracted from the seventh measurement in row 5 (13) because the subject (765) and procedure (Other) match. The result should form the "OUTPUT" column.

Thank you in advance!


Solution

  • Here is a hash object approach

    data have;
    input Subject Type $ 5-12 Procedure $ 15-22 Measurement;
    datalines;
    500 Initial   Invasive  20 
    500 Initial   Surface   35 
    500 Followup  Invasive  54 
    428 Followup  Outer     29 
    765 Seventh   Other     13 
    500 Followup  Surface   98 
    428 Initial   Outer     10 
    765 Initial   Other     19 
    610 Third     Invasive  66 
    610 Initial   Invasive  17 
    ;
    
    data want (drop=rc _Measurement);
       if _N_ = 1 then do;
          declare hash h (dataset : "have (rename=(Measurement=_Measurement) where=(Type='Initial'))");
          h.definekey ('Subject');
          h.definedata ('_Measurement');
          h.definedone();
       end;
    
       set have;
       _Measurement=.;
    
       if Type ne 'Initial' then rc = h.find();
       Measurement = sum (Measurement, -_Measurement);
    run;
    

    EDIT:

    data want (drop=rc _Measurement);
       if _N_ = 1 then do;
          declare hash h (dataset : "have (rename=(Measurement=_Measurement) where=(Type='Initial'))");
          h.definekey ('Subject');
          h.definedata ('_Measurement');
          h.definedone();
       end;
    
       set have;
       _Measurement=.;
    
       if Type ne 'Initial' then rc = h.find();
       NewMeasurement = ifn(Measurement=., ., sum (Measurement, -_Measurement));
    run;