stata, stata-macros

Track how many new variables have been created by splitting a string


I'm using the split command to split a variable which has multiple strings separated by a semicolon. I would also like to keep track of how many new string variables have been created as a consequence of splitting the original string variable and store it in a local macro.

So for example, if my initial data is something like:

State

PA;CA
MA
WA;CA;OR

and I run split State, p(;), I get:

State     State1 State2 State3

PA;CA      PA     CA
MA         MA
WA;CA;OR   WA     CA     OR

I would like to be able to find that it has created 3 new variables and store that value in a local macro.

Is there a way to do this?


Solution

  • Many Stata commands will leave behind useful objects and split is no exception.

    From the help file (which you can find by typing help split):

    Stored results
    
        split stores the following in r():
    
        Scalars
          r(nvars)       number of new variables created

        Macros
          r(varlist)     names of newly created variables
    

    These could be used like this:

    clear
    
    input str11 states
    "PA;CA"
    "MA"
    "WA;CA;OR;PA"
    end
    
    compress
    split states, parse(;) gen(S)
    
    display `: word count `r(varlist)''
    display r(nvars)
    

    The second is probably easier.
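
    To actually keep the count in a local macro, as the question asks, something along these lines should work. This is a minimal sketch reusing the example data above; the macro names n_new and new_vars are illustrative choices, not anything split defines. The copy is made immediately after split because the next r-class command replaces the contents of r().

    ```stata
    clear

    input str11 states
    "PA;CA"
    "MA"
    "WA;CA;OR;PA"
    end

    split states, parse(;) generate(S)

    * Copy the stored results straight away: any later r-class
    * command (count, summarize, ...) overwrites r()
    local n_new = r(nvars)
    local new_vars `r(varlist)'

    display "split created `n_new' new variables: `new_vars'"

    * The saved list remains usable even though count resets r()
    foreach v of local new_vars {
        count if !missing(`v')
    }
    ```

    Typing return list directly after split shows everything the command left behind in r(), which is a quick way to check what is available before copying it.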