
Error in reshape long multiple variables


I have to reshape my dataset from wide to long. I have 500 variables that cover the years 2007 to 2016 and are named like abcd2016. I need a procedure that lets me reshape without typing out all the variable names, so I ran:

unab vars : *2016 
local stubs16 : subinstr local vars "2016" "", all
unab vars : *2015 
local stubs15 : subinstr local vars "2015" "", all

and so on, then:

reshape long `stubs16' `stubs15' `stubs14' `stubs13' `stubs12' `stubs11' `stubs10' `stubs09' `stubs08' `stubs07', i(id) j(year)

but I get the error

invalid syntax
r(198);

Why? Can you help me to fix it?


Solution

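  • The idea is to specify each stub only once when reshaping to long format. In your command, each of `stubs16', `stubs15', and so on holds the same stubs with a different year stripped off, so reshape is handed every stub several times rather than once. Instead, remove the year part from the variable names and store the unique stubs in a local that you can pass to reshape: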

    /* (1) Fake Data */
    clear
    set obs 100
    gen id = _n
    foreach s in stub stump head {
        forvalues t = 2008(1)2018 {
            gen `s'`t' = rnormal()
        }
    }
    
    /* (2) Get a list of stubs and reshape */
    /* List the variables whose names contain 20; ds stores the list in r(varlist) */
    ds *20*
    /* remove the year part */
    local prefixes = ustrregexra("`r(varlist)'","20[0-9][0-9]","")
    /* remove duplicates from list */
    local prefixes: list uniq prefixes 
    reshape long `prefixes', i(id) j(t)
    

    This will store the numeric suffix in a variable called t.
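
    A shorter variant that stays closer to the code in the question might look like the sketch below. It rests on two assumptions about your data that are not confirmed by the question: every stub has a 2016 variable, and "2016" appears in variable names only as the year suffix. Under those assumptions, stripping 2016 from a single unab list already yields each stub exactly once, so no further macros are needed:

    ```stata
    * Assumption: every stub has a 2016 variable, and "2016" occurs only as the suffix
    unab vars : *2016
    * Strip the year so that only the stubs remain, each listed once
    local stubs : subinstr local vars "2016" "", all
    reshape long `stubs', i(id) j(year)
    ```

    Here the numeric suffix ends up in a variable called year. If some stubs are missing for 2016, the ds-based approach above is the safer choice.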