I want to replace all strings in a column with following:
strings new_strings
ABC_MNO_S3 S1
ABC_S1 S2
ABC_S2 S3
ABC_PQR_S3 S4
XYZ_MNO_S3 S5
XYZ_S1 S6
XYZ_S2 S7
XYZ_PQR_S3 S8
So whenever any of the above 'string' appear in my column 'states' I want to replace it with 'new_string' dynamically. I tried to put these list in an array and use TRANWRD
to search through the column and replace, but not working.
My column states and the desired output looks like following:
states states_result
TR_ABC_MNO_S3_ABC_S2 TR_S1_S3
TR_ABC_S1_ABC_S2 TR_S2_S3
Segment Segment
ABC_PQR_S3 S4
TR_XYZ_MNO_S3_XYZ_S2 TR_S5_S7
Year Year
St_XYZ_S2 St_S7
Could you please help? Thanks!
Sourav:
Because you mentioned TRANWRD
I will presume the strings
values can be found embedded within a states
value. The key to effective use of TRANWRD
is to TRIM
the values when variables are used for the target and replacement arguments.
Replacement concerns:
TRANWRD
will work if there is only one target value
embedded. There is a possibility that an earlier replacement could make rise a valid replacement that was not previously obvious. Consider the following state
value:
ABC_ABC_MNO_S3
A first loop over all targets would replace ABC_MNO_S3
with S1
and yield
ABC_S1
A second loop over all targets would replace ABC_S1
with S2
and yield
S2
Tested sample:
data have;
infile cards dlm="," dsd;
length states segment year $100;
input states segment year;
datalines;
"TR_ABC_MNO_S3_ABC_S2 TR_ABC_S1_ABC_S2", "ABC_PQR_S3 TR_XYZ_MNO_S3_XYZ_S2", "St_XYZ_S2"
run;
data mappings;
length string $30 new_string $2;
input string new_string;
datalines;
ABC_MNO_S3 S1
ABC_S1 S2
ABC_S2 S3
ABC_PQR_S3 S4
XYZ_MNO_S3 S5
XYZ_S1 S6
XYZ_S2 S7
XYZ_PQR_S3 S8
run;
data want;
array maps(100,2) $50 _temporary_; * first dimension must be larger than number of mappings;
do _i_ = 1 by 1 until (lastmap);
set mappings(rename=(string=_map_from new_string=_map_to)) end=lastmap;
maps(_i_,1) = _map_from;
maps(_i_,2) = _map_to;
end;
length status $12 _result $200;
do until (lastdata);
set have end=lastdata;
array targets states segment year;
status = 'ORIGINAL';
output;
do _i_ = 1 to dim(targets);
_result = targets[_i_];
_guard = 1;
do until (_noreplacement or _guard >= 10);
_noreplacement = 1;
do _j_ = 1 to dim(maps,1) while(maps(_j_,1) ne '');
if index(_result,trim(maps(_j_,1))) then do;
* put _result ': ' maps(_j_,1) '-> ' maps(_j_,2);
_result = tranwrd(_result, trim(maps(_j_,1)), trim(maps(_j_,2)));
_noreplacement = 0;
end;
end;
end;
if (_guard > 10) then do;
put 'WARNING: Guard limit 10 reached, mappings may be cycling.' _result;
end;
targets[_i_] = _result;
end;
status = 'MAPS APPLIED';
output;
end;
stop;
drop _:;
run;