The below code doesn't seem to be working for the variable all_s when there is more than 1 record with the same urn. Var1,2,3 work fine but that one doesn't and I cant figure out why. I am trying to have all_s equal to single_var1,2,3 concatenated with no spaces if it's first.urn but I want it to be
all_s = all_s + ',' + single_var1 + single_var2 + single_var3
when it's not the first instance of that urn.
data dataset_2;
set dataset_1;
by URN;
retain count var1 var2 var3 all_s;
format var1 $40. var2 $40. var3 $40. all_s $50.;
if first.urn then do;
count=0;
var1 = ' ';
var2 = ' ';
var3 = ' ';
all_s = ' ';
end;
var1 = catx(',',var1,single_var1);
var2 = catx(',',var2,single_var2);
var3 = catx(',',var3,single_var3);
all_s = cat(all_s,',',single_var1,single_var2,single_var3);
count = count+1;
if first.urn then do;
all_s = cat(single_var1,single_var2,single_var3);
end;
run;
all_s
is not large enough to contain the concatenation if the total length of the var1-var3
values within the group exceeds $50
. Such a scenario seems likely with var1-var3
being $40
.
I recommend using the length
function to specify variable lengths. format
will create a variable of a certain length as a side effect.
catx
removes blank arguments from the concatenation, so if you want spaces in the concatenation when you have blank single_varN
you won't be able to use catx
A requirement that specifies a concatenation such that non-blank values are stripped and blank values are a single blank will likely have to fall back to the old school trim(left(…
approach
Sample code
data have;
length group 8 v1-v3 $5;
input group (v1-v3) (&);
datalines;
1 111 222 333
1 . 444 555
1 . . 666
1 . . .
1 777 888 999
2 . . .
2 . b c
2 x . z
run;
data want(keep=group vlist: all_list);
length group 8 vlist1-vlist3 $40 all_list $50;
length comma1-comma3 comma $2;
do until (last.group);
set have;
by group;
vlist1 = trim(vlist1)||trim(comma1)||trim(left(v1));
vlist2 = trim(vlist2)||trim(comma2)||trim(left(v2));
vlist3 = trim(vlist3)||trim(comma3)||trim(left(v3));
comma1 = ifc(missing(v1), ' ,', ',');
comma2 = ifc(missing(v2), ' ,', ',');
comma3 = ifc(missing(v3), ' ,', ',');
all_list =
trim(all_list)
|| trim(comma)
|| trim(left(v1))
|| ','
|| trim(left(v2))
|| ','
|| trim(left(v3))
;
comma = ifc(missing(v3),' ,',',');
end;
run;
Reference
SAS has operators and multiple functions for string concatenation
||
concatenatecat
concatenatecatt
concatenate, trimming (remove trailing spaces) of each argumentcats
concatenate, stripping (remove leading and trailing spaces) of each argumentcatx
concatenate, stripping each argument and delimitingcatq
concatenate with delimiter and quote arguments containing the delimiterComparisons
The results of the CAT, CATS, CATT, and CATX functions are usually equivalent to results that are produced by certain combinations of the concatenation operator (||) and the TRIM and LEFT functions. However, the default length for the CAT, CATS, CATT, and CATX functions is different from the length that is obtained when you use the concatenation operator. For more information, see Length of Returned Variable.
Note: In the case of variables that have missing values, the concatenation produces different results. See Concatenating Strings That Have Missing Values.