I have a string variable in which some of the values have an extra character at the beginning. The extra character is the same in every case. The variable holds ICD codes. For example, instead of G23 I have DG23.
Is there a way in Stata to remove the excess D character?
My data look like this:
ID | diag |
---|---|
1 | DZ456 |
2 | DG32 |
3 | DY258 |
4 | DD35 |
5 | DS321 |
6 | DD21 |
7 | DA123 |
For basic information in this territory, consult help string functions.
* Example generated by -dataex-. To install: ssc install dataex
clear
input byte d str5 diag
1 "DZ456"
2 "DG32"
3 "DY258"
4 "DD35"
5 "DS321"
6 "DD21"
7 "DA123"
end
* Keep everything from the second character on, but only where the first character is D
replace diag = substr(diag, 2, .) if substr(diag, 1, 1) == "D"
list
     +----------+
     | d   diag |
     |----------|
  1. | 1   Z456 |
  2. | 2    G32 |
  3. | 3   Y258 |
  4. | 4    D35 |
  5. | 5   S321 |
     |----------|
  6. | 6    D21 |
  7. | 7   A123 |
     +----------+
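If you would rather not overwrite the original variable, a variation on the same idea is to check the data first and write the cleaned codes to a new variable. What follows is only a sketch: the variable name diag_clean and the extra observation "G23" (a code with no leading D) are made up for illustration and are not from the original data. It uses regexr() with the pattern "^D", which should match a D only at the very start of the string.

```stata
* Sketch only: diag_clean and the "G23" row are hypothetical, for illustration
clear
input byte id str5 diag
1 "DZ456"
2 "DG32"
3 "DD35"
4 "G23"
end

* See how many values lack the leading D before changing anything
count if substr(diag, 1, 1) != "D"

* Remove a single leading D into a new variable, leaving diag untouched
generate diag_clean = regexr(diag, "^D", "")

* Confirm that this agrees with the substr() approach
assert diag_clean == cond(substr(diag, 1, 1) == "D", substr(diag, 2, .), diag)

list
```

Note that codes such as DD35 lose only the first D, which is why anchoring at the start of the string matters. Something like subinstr(diag, "D", "", 1) would instead remove the first D wherever it occurs, so it would change a value such as G2D3 that never had the prefix.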