Search code examples
sqltimesasproc-sqlenterprise-guide

How to create flag 0/1 inform whether was changed or not during 4 months between two columns in PROC SQL in SAS Enterprise Guide?


I have table in SAS Enterprise Guide like below:

name of table: table123

date typ:

  • ID - numeric
  • DT - numeric
  • VAL1 - string
  • VAL2 - string
ID DT VAL1 VAL2
123 20210523 PR P
123 20211222 P P
222 20210502 PR M
222 20210711 M PR
38 20210513 PR PR
991 20211123 M P

And I need to create column "COL1" with values 0/1:

  • If ID changed value from column "VAL1" from PR to value P or M in column "VAL2" and has not changed back to PR for 4 months or more -> 1 else 0

SO I need something like below:

ID DT VAL1 VAL2 COL1
123 20210523 PR P 1
123 20211222 P P 1
222 20210502 PR M 0
222 20210711 M PR 0
38 20210513 PR PR 0
991 20211123 M P 0

Because:

  • 123 - has 1 in "COL1" because changed PR to P and does not changed back to PR for 4 month or more
  • 222 - has 0 in "COL1" because changed PR to M but did not last like this for at least 4 months (after 2 months backed to PR)
  • 38 - has 0 in "COL1" because did not change PR to P or M
  • 991 - has 0 in "COL1" because did not change PR to P or M

How can I do that in SAS Enterprise Guide in PROC SQL ?


Solution

  • You'll want to use a data step for this one since it uses fairly complex logic. You must first convert your date into a SAS date for this to work.

    data col1_flag;
        set have;
        by id dt;
    
        /* Do not reset these variables to missing at each row */
        retain flag_pr_to_p_pm dt_from_pr_to_p_m;
        
        /* Reset flags and dates at each new ID */
        if(first.id) then call missing(flag_pr_to_p_pm, dt_from_pr_to_p_m);
    
        /* If PR (VAL1) changes to P or M (VAL2), then set a flag. If VAL1
           changes back from P or M, reset the flag to 0*/
        if(VAL1 = 'PR' and VAL2 IN('P', 'M') ) then do;
            flag_pr_to_p_pm = 1;
    
            /* Store the number of months when it originally went from PR --> P or M */
            dt_from_pr_to_p_m = dt;     
        end;
            else if(VAL1 NOT IN ('P', 'M') ) then flag_pr_to_p_pm = 0;
        
        /* 1. We've encountered a situation where PR --> P or M
           2. It's been 4 months since PR --> P or M
           3. P or M has not changed back to PR
           4. OR: There's one row that goes from PR --> P or M
        */
        COL1 = (    flag_pr_to_p_pm
                AND intck('month', dt_from_pr_to_p_m, dt) > 4 
                AND NOT (VAL1 IN('P', 'M') AND VAL2 = 'PR')
                OR (flag_pr_to_p_pm AND first.id AND last.id)
               )
        ;
    
        if(COL1) then output;
    
        format dt_from_pr_to_p_m date9.;
    
        keep id;
    run;
    

    Output:

    ID  DT          VAL1    VAL2    COL1
    38  13MAY2021   PR      PR      0
    123 23MAY2021   PR      P       1
    123 22DEC2021   P       P       1
    222 02MAY2021   PR      M       0
    222 11JUL2021   M       PR      0
    991 23NOV2021   M       P       0