I need some help with Stata data transformation.
I have a survey in which a user can answer "no response", which has been coded as the integer 98. The variables can be of different data types. I need to count each user's "no response"/98 answers and store that count in a separate variable.
I attached the dataset sample:
UserN Q1 Q2 Q3 Q4 Q5 Q6 NewCreatedColumn
User1 11 "male" "12:55pm" 98 "Answer1" "other" 1
User2 98 "female" "1:00am" 98 "AnswerX" "Batman" 2
User3 16 "male" "1:00am" 34 "other" "superman" 0
User4 98 "female" "1:00am" 98 "other" "Dog" 2
User5 66 "male" "1:00am" 98 "Life" "Cat" 1
This would have been fairly easy in Python: each user's row in the dataframe can be treated as a list, and you can scan that list for the integer 98.
Is there an equivalent in Stata?
Thanks for the data example, improved below to become reproducible code. See also help dataex within Stata (or search dataex in an ancient Stata).
clear
input str5 UserN Q1 str7 (Q2 Q3) Q4 str8 (Q5 Q6) NewCreatedColumn
User1 11 "male" "12:55pm" 98 "Answer1" "other" 1
User2 98 "female" "1:00am" 98 "AnswerX" "Batman" 2
User3 16 "male" "1:00am" 34 "other" "superman" 0
User4 98 "female" "1:00am" 98 "other" "Dog" 2
User5 66 "male" "1:00am" 98 "Life" "Cat" 1
end
ds Q* , has(type numeric)
egen wanted = anycount(`r(varlist)'), values(98)
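As a quick sanity check, the computed count can be compared with the hand-coded column from the question. This is only a sketch, assuming the example data above are in memory and that wanted has just been created as shown:

```stata
* assert stops with an error if any row disagrees with the hand-coded count
assert wanted == NewCreatedColumn

* show the numeric answers next to both counts for visual inspection
list UserN Q1 Q4 wanted NewCreatedColumn
```

If assert returns silently, every row of wanted agrees with NewCreatedColumn.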
For counting the string foo, a loop will do it:
ds Q*, has(type string)
gen WANTED = 0
quietly foreach v in `r(varlist)' {
    replace WANTED = WANTED + (`v' == "foo")
}
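If "no response" could also appear as the text "98" in some string variables, the two approaches can be combined in a single pass over all the Q variables. The following is a minimal sketch under that assumption. The variable name total98 is made up for illustration, and the string coding "98" is hypothetical, since the example data contain 98 only in numeric variables:

```stata
* total98 is a hypothetical name; start every row's count at zero
gen total98 = 0

quietly foreach v of varlist Q* {
    * confirm returns _rc == 0 only when `v' is numeric
    capture confirm numeric variable `v'
    if _rc == 0 {
        * numeric variable: compare with the number 98
        replace total98 = total98 + (`v' == 98)
    }
    else {
        * string variable: compare with the text "98" (assumed coding)
        replace total98 = total98 + (`v' == "98")
    }
}
```

The comparison in parentheses evaluates to 1 when true and 0 when false, so each matching variable adds one to the row's count.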