Using local in a forvalues loop reports a syntax error


I am using two-level loops to create a set of variables. But Stata reports a syntax error.

    forvalues i = 1/5 {
        local to `i'+1
        dis `to'

        forvalues j = `to'/6{
            dis `j'
            gen e_`i'_`j' = .
        }

    }

I could not figure out where I made the syntax error.

And a follow-up question. I would like to change how the loop limits are coded in the example above. Right now, they are hard-coded as 5 and 6, but I want to base them on the data. For instance, I am coding as below:

    sum x
    scalar x_max_1 = `r(max)'-1
    scalar x_max_2 = `r(max)'

    forvalues i = 1/x_max_1 {
        local to = `i'+1
        dis `to'

        forvalues j = `to'/x_max_2{
            dis `j'
            gen e_`i'_`j' = .
        }

    }

However, Stata reports a syntax error in this case. I am not sure why. The scalar is a numeric variable. Why would the code above not work?


Solution

  • Your code would be better as

    forvalues i = 1/5 {
        local to = `i' + 1
        forvalues j = `to'/6 {
            gen e_`i'_`j' = .
        }
    }
    

    With your code you went

    local to `i' + 1 
    

    so the first time around the loop, the local macro to holds the text 1 + 1 rather than the number 2, and that text is illegal as an argument to forvalues. That is, a local definition without an = sign copies the text; it does not evaluate the expression.

    The way you used display could not show you this error because display used that way will evaluate expressions to the extent possible. If you had insisted that the macro was a string with

    di "`to'" 
    

    then you would have seen its contents.
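
    As a minimal sketch of the difference described above (the value 1 for i is just an illustrative stand-in for the first pass of the loop), the following can be run as a do-file:

    ```stata
    * Run as a do-file so the local macros persist across lines.
    local i 1

    * No equals sign: the text after the macro name is copied verbatim.
    local to `i' + 1
    display "`to'"      // shows the stored text: 1 + 1
    display `to'        // display evaluates that text as an expression: 2

    * With an equals sign: the expression is evaluated before storing.
    local to = `i' + 1
    display "`to'"      // shows the stored text: 2
    ```

    The quoted display is the one that reveals what forvalues would actually receive.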

    Another way to do it is

    forvalues i = 1/5 {
        forvalues j = `= `i' + 1'/6 {
            gen e_`i'_`j' = .
        }
    }
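
    One way to convince yourself that the inline `= exp' form behaves as intended is to count the variables it creates. This is only a sketch: it assumes a scratch session (clear drops the data in memory), and the macro names evars and n are made up for the illustration.

    ```stata
    * WARNING: clear drops the data in memory; run this in a scratch session.
    clear
    set obs 1

    forvalues i = 1/5 {
        * `= `i' + 1' is evaluated during macro expansion, before forvalues runs.
        forvalues j = `= `i' + 1'/6 {
            gen e_`i'_`j' = .
        }
    }

    * Collect the new variable names and count them.
    unab evars : e_*
    local n : word count `evars'
    display "`n' variables created"   // 5 + 4 + 3 + 2 + 1 = 15
    ```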
    

    EDIT

    You asked further about

    sum x
    scalar x_max_1 = `r(max)'-1
    scalar x_max_2 = `r(max)'
    
    forvalues i = 1/x_max_1 {
    

    and quite a lot can be said about that. Let's work backwards from one of various better solutions:

    sum x, meanonly 
    forvalues i = 1/`= r(max) - 1' {
    

    or another, perhaps a little more transparent:

    sum x, meanonly 
    local max = r(max) - 1 
    
    forvalues i = 1/`max' { 
    

    What are the messages here?

    1. If you only want the maximum, specify meanonly. Agreed: the option name alone does not imply this. See https://www.stata-journal.com/sjpdf.html?articlenum=st0135 for more.

    2. What is the point of pushing the r-class result r(max) into a scalar? You already have what you need in r(max). Educate yourself out of this with the following analogy.

    I have what I want. Now I put it into a box. Now I take it out of the box. Now I have what I want again. Come to think of it, the box business can be cut.

    The box is the scalar, two scalars in this case.

    3. forvalues won't evaluate scalars to give you the number you want. That will happen in many languages, but not here.

    4. More subtly, forvalues doesn't even evaluate local references or similar constructs. What happens is that Stata's generic syntax parser does that for you before what you typed is passed to forvalues.
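
    Putting the points above together, a data-driven version of the nested loop might look like the sketch below. The toy data (x running from 1 to 6) are an assumption made purely for illustration, and the final loop shows one possible way to use an existing scalar if you really have one: expand it inline with `= scalar()' so that forvalues only ever sees a number.

    ```stata
    * WARNING: clear drops the data in memory; run this in a scratch session.
    clear
    set obs 6
    gen x = _n                      // hypothetical data: x runs from 1 to 6

    summarize x, meanonly
    local max = r(max)              // copy now; later r-class commands overwrite r()

    forvalues i = 1/`= `max' - 1' {
        forvalues j = `= `i' + 1'/`max' {
            gen e_`i'_`j' = .
        }
    }

    * If a scalar already exists, expand it inline; forvalues never sees the name.
    scalar x_max_2 = `max'
    forvalues j = 5/`= scalar(x_max_2)' {
        display `j'                 // shows 5, then 6
    }
    ```

    Copying r(max) into a local before the loop is a cautious choice rather than a requirement here, since nothing inside this particular loop overwrites r().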