Search code examples
arrayssas

New to using SAS arrays and can't get array elements to populate with values based on IF-THEN-ELSE statement


Admittedly, I've never written an array, so I'm sure this is an easy solution for most users. I've got a dataset with two columns:

START_BLOCK (possible integer values 0-47) END_BLOCK (possible integer values 0-47)

I'm trying to create an array that adds fields TIME_BLOCK_0 through TIME_BLOCK_47 and inserts a 1 or 0 based on whether that time block [i] falls between (inclusive) START_BLOCK and END_BLOCK

For example, a record with START_BLOCK = 29 and END_BLOCK = 31 would have a value of 0 in every column except columns TIME_BLOCK_29, TIME_BLOCK_30, and TIME_BLOCK_31 which would be populated with a 1.

This is my first time asking for help on here so I apologize if my question isn't clear.

The code I tried (that did create the TIME_BLOCK_[i] fields but did not populate them) was:

data WANT (drop = i);
set HAVE;
array time_blocks[48] time_block_0 - time_block_47;
do i = 0 to 47;

* midnight = time block 0;

* First, account for cases that end on a different day than they start (where the end_block is < the start_block)- not common;
if end_block < start_block then do;
    if end_block >= i or start_block <= i then time_block_i = 1; else time_block_i = 0;
    end;
* then, for cases where start_block < end_block (most cases that start and end on the same day) - most common;
else if start_block <= [i] and end_block >= [i] then time_block_i = 1; else time_block_i = 0;

end;

run;


Solution

  • It will be less confusing if you tell SAS to use indexes of 0 to 47 instead of the normal 1 to 48.

    Let's first make some example data so we have something to test with.

    data have;
      input start_block end_block;
    cards;
    0 47
    2 4
    -1 56
    56 -1
    ;
    

    Since SAS evaluates boolean expressions as either 0 (FALSE) or 1 (TRUE) you can skip the IF/THEN logic and just use an assignment statement. (Also don't bother using dataset options to DROP a variable. A simple DROP statement is all you need. Save the complication of dataset options for when you have to use them.)

    data WANT ;
      set HAVE;
      array time_blocks[0:47] time_block_0 - time_block_47;
      do index = 0 to 47;
        time_blocks[index] = start_block <= index <= end_block;
      end;
      drop index;
    run;
    

    enter image description here

    In comments you seem to say the actual data is time of day values for surgeries. In that case use the actual datetime values and loop between them.

    Initialize the array to zeros, then loop between the start and end by half-hour increments and then increment the appropriate variable.

    data have;
      input start :datetime. end :datetime.;
      format start end datetime19.;
    cards;
    01JAN2024:00:15 01JAN2024:04:35 
    01JAN2024:22:15 02JAN2024:03:00 
    ;
    
    data WANT ;
      set HAVE;
      array time_blocks[0:47] time_block_0 - time_block_47;
      do index = 0 to 47; time_blocks[index] = 0; end;
      do datetime=start to end by '00:30:00't ;
        time_blocks[int(timepart(datetime)/'00:30:00't)]+1;
      end;
      drop index datetime;
    run;
    

    If you only have time of day but not actual datetime values then just convert them. So use DAY=0 when converting START_TIME to a datetime value and when END_TIME is less than START_TIME then use DAY=1 when converting it to a datetime value. Example:

    start = dhms(0,0,0,start_time);
    end = dhms(end_time<start_time,0,0,end_time);
    format start_time end_time tod5. start end datetime19.;