Stata: Using if with value labels


I ran into a problem when using an if condition with value labels.

set obs 5
gen var1 = _n
label define l_var1 1 "cat1" 2 "cat1" 3 "cat2" 4 "cat3" 5 "cat3"
label val var1 l_var1
keep if var1=="cat3":l_var1
(4 observations deleted)

I expected 3 records to be deleted. How can I achieve this?

I am using Stata 16.1.


Solution

  • "cat3":l_var1 does not look up every value in l_var1 that corresponds to "cat3". It returns only the first value that corresponds to the string "cat3".

    So "cat3":l_var1 evaluates to 4, which means keep if var1=="cat3":l_var1 is equivalent to keep if var1==4, and therefore only one observation is kept.

    The code below shows this behavior. It may not be how you want "cat3":l_var1 to behave, but it is how it behaves.

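    clear
    set obs 5
    gen var1 = _n
    * Same labels as in the question, with 5 defined before 4
    label define l_var1 1 "cat1" 2 "cat1" 3 "cat2"  5 "cat3" 4 "cat3"
    label val var1 l_var1
    * var2 holds the single value that "cat3":l_var1 evaluates to
    gen var2 = "cat3":l_var1
    * var3 is 1 only where var1 equals that single value
    gen var3 = 1 if var1=="cat3":l_var1
    * Show the underlying numeric values rather than the labels
    list, nolabel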
    

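    That explains what is going on in your code. The code below is a better way to do what you are trying to do: decode the variable into a string variable that holds the label text, then compare against that string.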

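    clear
    set obs 5
    gen var1 = _n
    label define l_var1 1 "cat1" 2 "cat1" 3 "cat2"  5 "cat3" 4 "cat3"
    label val var1 l_var1

    * Create a string variable holding each observation's label text
    decode var1, generate(var_str)
    * Keep every observation labeled "cat3", whatever its numeric value
    keep if var_str == "cat3"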
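
    If you would rather not add a string variable, a possible alternative is to loop over the numeric values and compare each value's label text. This is only a sketch under the same setup as the question. The variable tokeep and the locals levels and lbl are hypothetical names introduced for illustration.

    clear
    set obs 5
    gen var1 = _n
    label define l_var1 1 "cat1" 2 "cat1" 3 "cat2" 4 "cat3" 5 "cat3"
    label val var1 l_var1

    * tokeep is a hypothetical helper flag used only in this sketch
    gen byte tokeep = 0
    levelsof var1, local(levels)
    foreach v of local levels {
        * Look up the label text attached to this numeric value
        local lbl : label l_var1 `v'
        if "`lbl'" == "cat3" {
            replace tokeep = 1 if var1 == `v'
        }
    }
    keep if tokeep
    drop tokeep

    With the labels above this should keep the two observations labeled "cat3" and delete the other three, which is the result the question asked for. When the matching values are known in advance, keep if inlist(var1, 4, 5) is simpler still.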