I have dataset in the followin format. The first 5 variables are already exist. I need to create the "NEED_TO".
It means that after row with '1' on column MTH_SATTL and when date on START column is older then date on DATA column then fill in 1 till to the next GROUP nad GR_ROZL.
I tried use RETAIN statement, but i think i make something wrong in syntax
Data Start Group GR_ROZL MTH_SATTL Need_to
01OCT2019 . G11 2 0 0
01NOV2019 . G11 2 0 0
01DEC2019 . G11 2 0 0
01JAN2020 01JAN2020 G11 2 0 0
01FEB2020 01JAN2020 G11 2 1 1
01MAR2020 01JAN2020 G11 2 0 1
01APR2020 01JAN2020 G11 2 0 1
01OCT2019 . G11 3 0 0
01NOV2019 . G11 3 0 0
01DEC2019 . G11 3 0 0
01JAN2020 01JAN2020 G11 3 0 0
01FEB2020 01JAN2020 G11 3 0 0
01MAR2020 01JAN2020 G11 3 1 1
01APR2020 01JAN2020 G11 3 0 1
01OCT2019 . G12W 2 0 0
01NOV2019 . G12W 2 0 0
01DEC2019 . G12W 2 0 0
01JAN2020 01JAN2020 G12W 2 0 0
01FEB2020 01JAN2020 G12W 2 1 1
01MAR2020 01JAN2020 G12W 2 0 1
01APR2020 01JAN2020 G12W 2 0 1
The rule "fill in 1 till to the next GROUP and GR_ROZL" means you will want to have by processing. If the BY
groups are contiguous but not collated you will need the NOTSORTED
option. You will also need a RETAIN
ed variable that gets reset at the start of a group and conditionally assigned according to your logic.
Example:
data want;
set have;
by group gr_rozl NOTSORTED;
retain need_to 0;
* reset at the start of group;
if first.gr_rozl then need_to = 0;
* assign need_to if it is unassigned and meets assignment criteria;
* this ensures the assigned value is carried forward (via retain) until reset again;
if not need_to and start < data then need_to = 1;
* alternate way to assign using logic operator OR;
* need_to = need_to OR (start < data);
run;
To set the need_to
value on the subsequent rows after first occurrence of start < data
you can use an additional retained variable as a logic flag.
Example:
data have; input
Date Start Group $ GR_ROZL MTH_SATTL;
attrib date start informat=date9. format=date9.;datalines;
01OCT2019 . G11 2 0
01NOV2019 . G11 2 0
01DEC2019 . G11 2 0
01JAN2020 01JAN2020 G11 2 0
01FEB2020 01JAN2020 G11 2 1
01MAR2020 01JAN2020 G11 2 0
01APR2020 01JAN2020 G11 2 0
01OCT2019 . G11 3 0
01NOV2019 . G11 3 0
01DEC2019 . G11 3 0
01JAN2020 01JAN2020 G11 3 0
01FEB2020 01JAN2020 G11 3 0
01MAR2020 01JAN2020 G11 3 1
01APR2020 01JAN2020 G11 3 0
01OCT2019 . G12W 2 0
01NOV2019 . G12W 2 0
01DEC2019 . G12W 2 0
01JAN2020 01JAN2020 G12W 2 0
01FEB2020 01JAN2020 G12W 2 1
01MAR2020 01JAN2020 G12W 2 0
01APR2020 01JAN2020 G12W 2 0
run;
data want;
set have;
by group gr_rozl NOTSORTED;
retain need_to need_to_flag 0;
* reset at the start of group;
if first.gr_rozl then do; need_to = 0; need_to_flag = 0; end;
* set retained value for fill in;
if not need_to and need_to_flag then need_to = 1;
* set flag for setting retained value on subsequent rows;
if mth_sattl and not need_to_flag and start < date then need_to_flag = 1;
drop need_to_flag;
run;