
Recoding multiple variables with loop in Stata


I have 13 dummy variables that share a common name prefix, like so (I haven't included all 13 for space reasons):

describe Q23*

Variable      Storage   Display    Value
    name         type    format    label      Variable label
Q23Agricultur~k str5    %9s                   Q23: Agriculture 
Q23Miningandq~g str5    %9s                   Q23: Mining
Q23Trade        str5    %9s                   Q23: Trade
Q23Teaching     str5    %9s                   Q23: Teaching
Q23Healthrela~k str5    %9s                   Q23: Health related work
Q23Transport    str5    %9s                   Q23: Transport
Q23Repairing    str5    %9s                   Q23: Repairing
Q23Construction str5    %9s                   Q23: Construction
Q23Manufactur~g str5    %9s                   Q23: Manufacturing
Q23Domesticwo~e str5    %9s                   Q23: Domestic work 

They're all dummy variables taking on TRUE/FALSE values, so I want to encode them as categorical. Since they have similar names, I figured this would do the trick:

foreach x of varlist Q23* {
    forvalues i = 1/13{
    encode `x', gen(sector`i')
    }
}

Yet what it does is generate 13 sector variables with names from 1 to 13, but all corresponding to agriculture, and none for the remaining variables.

describe sector*

Variable      Storage   Display    Value
    name         type    format    label      Variable label
--------------------------------------------------------------------------------------------------
sector1         long    %8.0g      sector1    Q23: Agriculture and livestock
sector2         long    %8.0g      sector2    Q23: Agriculture and livestock
sector3         long    %8.0g      sector3    Q23: Agriculture and livestock
sector4         long    %8.0g      sector4    Q23: Agriculture and livestock
sector5         long    %8.0g      sector5    Q23: Agriculture and livestock
sector6         long    %8.0g      sector6    Q23: Agriculture and livestock
sector7         long    %8.0g      sector7    Q23: Agriculture and livestock
sector8         long    %8.0g      sector8    Q23: Agriculture and livestock
sector9         long    %8.0g      sector9    Q23: Agriculture and livestock
sector10        long    %8.0g      sector10   Q23: Agriculture and livestock
sector11        long    %8.0g      sector11   Q23: Agriculture and livestock
sector12        long    %8.0g      sector12   Q23: Agriculture and livestock
sector13        long    %8.0g      sector13   Q23: Agriculture and livestock

What am I doing wrong? Why is the loop not working?

Thanks!


Solution

  • It's hard to verify what you're doing wrong here without a small snippet of your data as a reproducible example (as TheIceBear mentions in their comment), but two pointers:

    1. The language 'encoding dummy variables into categorical variables' doesn't really make sense in this instance (or any that I can think of). Dummy variables are just a special case of a categorical variable where the number of categories = 2. And if you start with 2 categories in your variable I can't see how you'd convert that into more than 2 categories without some additional data. Nevertheless, this is more of a semantic point -- I think what you mean is that you want to encode your string variables into numeric variables (based on your use of encode).

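    2. You do not need the inner loop. On the first iteration of the outer loop, the inner loop creates sector1 through sector13, all encoded from the first Q23 variable (agriculture). On the second iteration, encode tries to generate sector1 again; because gen() cannot overwrite an existing variable, Stata stops with a "variable sector1 already defined" error, and the remaining Q23 variables are never encoded. That is why every sector variable carries the agriculture label. The following code may help you resolve this: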

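    local i = 1
    foreach x of varlist Q23* {
      * encode the current string variable into its own numeric variable
      encode `x', gen(sector`i')
      * advance the counter so the next variable gets a new name
      local i = `i' + 1
    }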
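
If what you ultimately want is a set of conventional 0/1 indicators, note that encode numbers the string values in alphabetical order, so a variable containing both values would typically come out as FALSE = 1 and TRUE = 2 rather than 0 and 1. A minimal sketch of an alternative, using the same single loop and counter, is below. The three-variable toy dataset is made up purely for illustration (your real data will differ), and the code assumes the strings are exactly "TRUE" and "FALSE":

```stata
* Toy data: three hypothetical Q23 string variables (an assumption, not your data)
clear
input str5 Q23Trade str5 Q23Teaching str5 Q23Transport
"TRUE"  "FALSE" "FALSE"
"FALSE" "TRUE"  "FALSE"
"FALSE" "FALSE" "TRUE"
end

local i = 1
foreach x of varlist Q23* {
    * 1 if the string is "TRUE", 0 otherwise; left missing if the string is empty
    generate byte sector`i' = (`x' == "TRUE") if !missing(`x')
    * carry the original variable label (if any) over to the new variable
    local lbl : variable label `x'
    label variable sector`i' "`lbl'"
    * advance the counter
    local ++i
}

describe sector*
```

Here each Q23 variable is visited exactly once, so sector1, sector2 and sector3 correspond to the Q23 variables in dataset order. If your strings use different spellings or capitalisation, the comparison against "TRUE" would need adjusting.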