I am doing data clean-up in Stata and I need to recode a variable to equal 1 if any of a set of other variables is equal to 1, 6, or 7.
I can do this using the code below:
replace anyadl = 1 if diffdress==1 | diffdress==6 | diffdress==7 | ///
diffwalk==1 | diffwalk==6 | diffwalk==7 | ///
diffbath==1 | diffbath==6 | diffbath==7 | ///
diffeat==1 | diffeat==6 | diffeat==7 | ///
diffbed==1 | diffbed==6 | diffbed==7 | ///
difftoi==1 | difftoi==6 | difftoi==7
However, this is very inefficient to type out and it is easy to make errors.
Is there a simpler way to do this?
For example, something along the following lines:
replace anyadl = 1 if diff* == (1 | 6 | 7)
Your fantasy syntax wouldn't do what you want even if it were legal: 1|6|7 would be evaluated as 1. That is, in Stata 1 OR 6 OR 7 is in effect true OR true OR true, so true, and thus 1, given the rules that non-zero is true as input and true is 1 as output. The expression 1|6|7 is legal; it's the wildcard in an equality or inequality that isn't.
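The evaluation rule can be checked directly with display. This is only a small sketch; the value 6 is an arbitrary stand-in for one observation of a variable such as diffdress.

```stata
* 1 OR 6 OR 7: every non-zero operand counts as true, so the result is 1
display 1 | 6 | 7

* Comparing with the parenthesised expression therefore compares with 1,
* so a value of 6 is not matched and this shows 0
display 6 == (1 | 6 | 7)

* Spelling out each comparison, or using inlist(), gives the intended 1
display (6 == 1) | (6 == 6) | (6 == 7)
display inlist(6, 1, 6, 7)
```

So even if wildcards were allowed there, the comparison would only ever pick up values equal to 1.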
Stepping back, your code is producing an indicator (some people say dummy) variable with values 1 or missing. In practice such a variable is much more useful if created with values 0 and 1 (and in some instances missing too).
generate anyadl = 0
foreach v in dress walk bath eat bed toi {
    replace anyadl = 1 if inlist(diff`v', 1, 6, 7)
}
is one approach. In general, note both inlist(foo, 1, 6, 7) and inlist(1, foo, bar, bazz) as useful constructs.
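To make the difference between the two forms concrete, here is a minimal sketch on made-up data. The values typed in below and the new variable names foo167 and any1 are purely illustrative assumptions, not anything from your dataset.

```stata
clear
input foo bar bazz
1 2 3
6 2 3
2 1 7
2 3 4
end

* Form 1: is foo equal to any of the values 1, 6, 7?
generate byte foo167 = inlist(foo, 1, 6, 7)

* Form 2: is any of foo, bar, bazz equal to 1?
generate byte any1 = inlist(1, foo, bar, bazz)

list
```

Here foo167 should be 1 in the first two observations only, and any1 should be 1 in the first and third only. Both are 0/1 indicators because inlist() returns 1 for a match and 0 otherwise. Note that the second form tests a single value across several variables, so testing 1, 6, and 7 across several variables still needs a loop or a combination of conditions.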
Reading:
This paper on generating indicators