
Let a variable equal multiple values in an if-statement


I am doing data clean-up in Stata and I need to recode a variable to equal 1 if any of a set of other variables equals 1, 6, or 7.

I can do this using the code below:

replace anyadl = 1 if diffdress==1 | diffdress==6 | diffdress==7 | ///
                      diffwalk==1  | diffwalk==6  | diffwalk==7  | ///
                      diffbath==1  | diffbath==6  | diffbath==7  | ///
                      diffeat==1   | diffeat==6   | diffeat==7   | ///
                      diffbed==1   | diffbed==6   | diffbed==7   | /// 
                      difftoi==1   | difftoi==6   | difftoi==7

However, this is very inefficient to type out and it is easy to make errors.

Is there a simpler way to do this?

For example, something along the following lines:

replace anyadl = 1 if diff* == (1 | 6 | 7)

Solution

  • Your fantasy syntax wouldn't do what you want even if it were legal, because 1|6|7 would be evaluated as 1. That is, in Stata 1 OR 6 OR 7 is in effect true OR true OR true, so true, and thus 1, given the rules that non-zero is true as input and true is 1 as output. The expression 1|6|7 is itself legal; it's the wildcard in an equality or inequality that isn't.
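
    As a quick check of that evaluation rule, the following can be typed at the Stata prompt. It is only an illustrative sketch, and the literal values are arbitrary.

    ```stata
    * | is logical OR: any non-zero operand counts as true, and true is returned as 1
    display (1|6|7)

    * so this comparison is really 6 == 1, which is false (0)
    display (6 == (1|6|7))

    * inlist() performs the intended membership test: 6 is among 1, 6, 7
    display inlist(6, 1, 6, 7)
    ```

    The three commands display 1, 0, and 1 respectively.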

    Stepping back, your code is producing an indicator (some people say dummy) variable with values 1 or missing. In practice such a variable is much more useful if created with values 0 and 1 (and in some instances missing too).

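    * start from 0 (assumes anyadl does not exist yet; otherwise use replace anyadl = 0)
    generate anyadl = 0

    * set to 1 if any of the diff* variables equals 1, 6, or 7
    foreach v in dress walk bath eat bed toi {
        replace anyadl = 1 if inlist(diff`v', 1, 6, 7)
    }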
    

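    is one approach. In general, note both inlist(foo, 1, 6, 7) and inlist(1, foo, bar, bazz) as useful constructs: the first is true if foo equals any of 1, 6, or 7, and the second is true if any of foo, bar, or bazz equals 1.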
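
    To make the two forms of inlist() concrete, here is a minimal sketch on a small made-up dataset. The data values and the variable any1 are hypothetical, and only three of the diff* variables are used to keep it short.

    ```stata
    clear
    * toy data for illustration only (not from the question)
    input diffdress diffwalk diffbath
    1 2 3
    2 2 2
    3 7 .
    . . .
    end

    * form 1: does a given variable take any of the values 1, 6, 7?
    generate byte anyadl = 0
    foreach v of varlist diffdress diffwalk diffbath {
        replace anyadl = 1 if inlist(`v', 1, 6, 7)
    }

    * form 2: is the value 1 found among several variables?
    generate byte any1 = inlist(1, diffdress, diffwalk, diffbath)

    list
    ```

    In this sketch a row whose variables are all missing gets 0 on both indicators, because missing matches none of the listed values. Whether such rows should instead be set to missing is a separate decision about the data.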

    Reading:

    This paper on generating indicators

    This one on useful functions

    This one on inlist() and inrange()

    FAQ on true and false in Stata