
Splitting values in Stata and saving them in a new variable


I have a numeric variable with values like the following:

1
2
12
21
2

I would like to split the values that have more than one digit and put the second half of each such value in another variable.

So the second variable would have the values:

.
.
2
1
.

In principle I could just use a simple replace statement, but I am looking for code or a loop that recognizes the double-digit values, splits them automatically, and saves the second part in the second variable. More observations will be added over time, and I cannot do this task manually for more than 10k cases.


Solution

  • Here's one approach:

    clear 
    input foo 
    1
    2
    12
    21
    2
    end 
    
    * foo1: everything before the last digit (0 for single-digit values)
    generate foo1 = floor(foo/10)
    * foo2: the last digit
    generate foo2 = mod(foo, 10)
    
    list 
    
         +-------------------+
         | foo   foo1   foo2 |
         |-------------------|
      1. |   1      0      1 |
      2. |   2      0      2 |
      3. |  12      1      2 |
      4. |  21      2      1 |
      5. |   2      0      2 |
         +-------------------+
    

    For more on these functions, see the Stata help entries for floor() and mod().

    If zeros in the first part should be missing instead, then

    replace foo1 = . if foo1 == 0 
    

    or, to do it in one step, use this in place of the first generate command above:

    generate foo1 = floor(foo/10) if foo >= 10 
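
    The same kind of if qualifier can produce the layout asked for in the question, where the second variable is missing for single-digit values. This is only a sketch, assuming foo holds non-negative integers; the variable name second is illustrative and not part of the original answer:

    * last digit, but only for values with two or more digits
    generate second = mod(foo, 10) if foo >= 10

    With the example data this leaves second missing for 1, 2 and 2, and gives 2 for 12 and 1 for 21.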
    

    The code also works for values with three or more digits: foo1 then holds everything before the last digit and foo2 holds the last digit, so 123 gives 12 and 3.
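
    If every digit of a longer value should go into its own variable, one possible alternative is to work on the string form of the number. The sketch below is an assumption-laden illustration rather than part of the answer above: it assumes foo holds non-negative integers small enough that string() shows them in full rather than in scientific notation, and the names len and digit1, digit2, ... are made up for the example.

    clear
    input foo
    1
    12
    345
    end

    * length of the longest value, counted in characters
    generate len = strlen(string(foo))
    summarize len, meanonly
    local maxlen = r(max)

    * one variable per digit position; real("") is missing,
    * so shorter values get missing in the later positions
    forvalues j = 1/`maxlen' {
        generate digit`j' = real(substr(string(foo), `j', 1))
    }
    drop len

    Here 345 ends up as 3, 4 and 5 in digit1 to digit3, while 1 gets missing in digit2 and digit3. The arithmetic approach with floor() and mod() avoids the string conversion and is the more direct choice when only the last digit needs to be split off.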