I have a column ID with both string and int values. I wanted to insert a dash between the string and int for those ID values. However, I am not sure where to start. Any advice or tip would be greatly appreciated.
I have this in PROC SQL code:
PROC SQL;
select
case when ID CONTAINTS "ABC" THEN CATX("-", SUBSTR(ID,1,3), SUBSTR(ID,3,6)
when ID CONTAINTS "AB" THEN CATX("-", SUBSTR(ID,1,2), SUBSTR(ID,2,7)
when ID CONTAINTS "ABCS" THEN CATX("-", SUBSTR(ID,1,4), SUBSTR(ID,4,6)
else ID
end as ID,
Name, Age, Gender
from A;
quit;
Example from this:
|ID |Name |Age|Gender|
|123456789|Sam |30 |M |
|232456676|Jessica|20 |F |
|ABC134475|Suzen |29 |F |
|AB1235674|Alex |26 |M |
|ABCS24563|NON |15 |F |
To this:
|ID |Name |Age|Gender|
|123456789 |Sam |30 |M |
|232456676 |Jessica|20 |F |
|ABC-134475|Suzen |29 |F |
|AB-1235674|Alex |26 |M |
|ABCS-24563|NON |15 |F |
Use the compress
function with specific modifiers to extract the character and numeric values from alphanumeric strings, then concatenate with the desired delimiter.
data want;
set have;
char = compress(id, '', "D");
num = compress(id, '', "A");
if not missing(char) and not missing(num) then id = catx('-', char, num);
else if not missing(char) and missing(num) then id = char;
else if missing(char) and not missing(num) then id = num;
else id = '';
drop char num;
run;
+------------+---------+-----+--------+
| ID | Name | Age | Gender |
+------------+---------+-----+--------+
| 123456789 | Sam | 30 | M |
| 232456676 | Jessica | 20 | F |
| ABC-134475 | Suzen | 29 | F |
| AB-1235674 | Alex | 26 | M |
| ABCS-24563 | NON | 15 | F |
+------------+---------+-----+--------+