Generate new variables as functions of existing with foreach


I have searched the forum but cannot find a solution to my problem. I have 400 repeated measures for 30 study participants, and I want to generate the difference between each pair of measurements for each participant. I thought the foreach command would save me a lot of work.

The variable names have a prefix that lets me separate the first and second measurement: S_me and E_me. This is followed by a specific 22-character code that allows me to pick the right measures.

So I want one new variable for the difference between S_me_XXXXXXXXXXXXXXXXXXXXX1 and E_me_XXXXXXXXXXXXXXXXXXXXX1, another for the difference between S_me_XXXXXXXXXXXXXXXXXXXXX2 and E_me_XXXXXXXXXXXXXXXXXXXXX2, and so on up to S_me_XXXXXXXXXXXXXXXXXXX400 and E_me_XXXXXXXXXXXXXXXXXXX400.

I have now tried:

unab where : S_me*
local where " `where'" 
local where : subinstr local where " S_me" " ", all 
display "`where'" 

foreach c of local where {
    gen Diff_`c' = S_me`c'- E_me`c' 
} 

I adapted this from a similar post here, but it does not work.

Neither does:

foreach x of varlist S_me* {
    gen Diff_`x' = (S_me`x'-E_me`x')
} 

Now I hope someone sees a great solution to my problem.


Solution

  • The point of a reshape can be illustrated with a trimmed-down version. You evidently have something like

    clear 
    input id S_me_X1 E_me_X1 S_me_X2 E_me_X2 
    1    3 4 5 7    
    2    10 12 14 16 
    end 
    

    only with different variable names and values, and with more variables and observations.

    Once you reshape, the difference is just one new variable:

    reshape long S_me_ E_me_ , i(id) j(which) string 
    
    gen diff = S_me_ - E_me_ 
    
    list
    
    
         +-----------------------------------+
         | id   which   S_me_   E_me_   diff |
         |-----------------------------------|
      1. |  1      X1       3       4     -1 |
      2. |  1      X2       5       7     -2 |
      3. |  2      X1      10      12     -2 |
      4. |  2      X2      14      16     -2 |
         +-----------------------------------+
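
    With the data in long form, describing the differences per measurement code takes a single command. The following is a minimal sketch on the same toy data, assuming that a table of means, standard deviations, and counts by code is the kind of description wanted; the choice of statistics is an illustration, not something the question specifies.

    ```stata
    clear
    input id S_me_X1 E_me_X1 S_me_X2 E_me_X2
    1    3 4 5 7
    2    10 12 14 16
    end

    * Stack the S/E pairs: one row per participant and measurement code
    reshape long S_me_ E_me_, i(id) j(which) string
    gen diff = S_me_ - E_me_

    * One row of summary statistics per measurement code
    tabstat diff, by(which) statistics(mean sd n)
    ```

    The same by-group idea carries over to graphs and models, since the measurement code is now an ordinary variable rather than part of 400 variable names.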
    

    Once you get into the situation of creating 400 new variables every time you do something, you are sliding to perdition. How are you going to describe, graph, or model them?
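
    If the wide layout is nevertheless required, note why the second loop in the question fails: inside foreach ... of varlist S_me*, the macro x already holds the full variable name, so S_me`x' refers to a variable that does not exist. A possible fix, sketched on the toy data and assuming every S_me_ variable has an E_me_ partner with an identical suffix, is to strip the prefix first. The macro name code and the diff_ prefix are arbitrary choices for this illustration.

    ```stata
    clear
    input id S_me_X1 E_me_X1 S_me_X2 E_me_X2
    1    3 4 5 7
    2    10 12 14 16
    end

    * Loop over the first measurements; v holds the full name, e.g. S_me_X1
    foreach v of varlist S_me_* {
        * Remove the prefix to recover the measurement code, e.g. X1
        local code : subinstr local v "S_me_" ""
        * Difference between first and second measurement for this code
        gen diff_`code' = `v' - E_me_`code'
    }

    list
    ```

    On the toy data this gives diff_X1 = -1 and -2 and diff_X2 = -2 and -2, matching the long-form result above. With a 22-character code the new names stay within Stata's 32-character limit, but a longer prefix than diff_ could exceed it.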