I have the following numeric variable in Stata:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long r_3srhlt
3
3
2
2
4
1
1
3
3
4
end
label values r_3srhlt r_3srhlt
label def r_3srhlt 1 ".", modify
label def r_3srhlt 2 "2.very ...", modify
label def r_3srhlt 3 "3.good", modify
label def r_3srhlt 4 "5.poor", modify
I would like to keep just the number and not the text.
For example, I want 3, 3, 2, 2, 5, ., ., 3, 3, 5
without the "good", "very good", "poor", etc. My data was originally a Stata file that I read into R via haven. After doing some manipulations on the file, I imported it back into Stata.
How can I accomplish this?
You have a numeric variable with value labels attached, so you first need to convert the labels to a string variable:
decode r_3srhlt, generate(r_3srhlt_string)
Then you can get all numbers in one go using the real() function and a simple regular expression:
generate wanted = real(ustrregexs(0)) if ustrregexm(r_3srhlt_string, "[0-9]*")
list, separator(0) abbreviate(15)
     +---------------------------------------+
     |   r_3srhlt   r_3srhlt_string   wanted |
     |---------------------------------------|
  1. |     3.good            3.good        3 |
  2. |     3.good            3.good        3 |
  3. | 2.very ...        2.very ...        2 |
  4. | 2.very ...        2.very ...        2 |
  5. |     5.poor            5.poor        5 |
  6. |          .                 .        . |
  7. |          .                 .        . |
  8. |     3.good            3.good        3 |
  9. |     3.good            3.good        3 |
 10. |     5.poor            5.poor        5 |
     +---------------------------------------+
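Note that the pattern "[0-9]*" also matches an empty string, so the if condition above is always true; the missing values only come out right because real("") returns missing. As a minimal sketch of two slightly stricter variants, assuming every label either starts with its numeric code followed by a period or is just ".", something like the following might work. The variable names wanted_regex and wanted_substr are illustrative and not part of the original data.

```stata
* Recreate the example data from the question
clear
input long r_3srhlt
3
3
2
2
4
1
1
3
3
4
end
label define r_3srhlt 1 "." 2 "2.very ..." 3 "3.good" 4 "5.poor"
label values r_3srhlt r_3srhlt

* Turn the value labels into a string variable
decode r_3srhlt, generate(r_3srhlt_string)

* Variant A: anchor the pattern at the start and require at least one digit,
* so observations without a leading number simply stay missing
generate wanted_regex = real(ustrregexs(0)) if ustrregexm(r_3srhlt_string, "^[0-9]+")

* Variant B (assumes each label contains a period after the code):
* take everything before the first period; an empty string becomes missing
generate wanted_substr = real(substr(r_3srhlt_string, 1, strpos(r_3srhlt_string, ".") - 1))

* Both variants should agree on this data, including the missing values
assert wanted_regex == wanted_substr
```

Variant B avoids regular expressions entirely, but it depends on the period separator being present in every label, so the regex version is probably the safer choice if the labels are less uniform than in this example.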