Search code examples
statareshapedata-management

Generate new variable from observations in multiple variables


I have an example dataset with the variables date, spx (market price), trans_1 (stock 1 price) and trans_2 (stock 2 price):

* Example generated by -dataex-. To install: ssc install dataex
clear
input int(date spx) double(trans_1 trans_2)
14610 100     1    1
14611 102   1.1   .9
14612 103  1.21  .81
14613 104 1.321 .729
end
format %tdnn/dd/CCYY date

I need to rearrange the data so that I derive three variables: trans_id, date and trans_price where every date in the dataset exists for every stock id and all stock prices are combined in a stock_price variable.

Expected outcome:

   date   spx   trans   trans_id

1/1/2000   100         1         1 
1/2/2000   102       1.1         1 
1/3/2000   103      1.21         1 
1/4/2000   104     1.321         1 
1/1/2000   100         1         2
1/2/2000   102        .9         2
1/3/2000   103       .81         2
1/4/2000   104      .729         2

I have read through various forums trying to make use of the xpose, merge or append commands, but could not come up with a possible solution.


Solution

  • The following works for me:

    generate id = _n
    reshape long trans_ , i(id) j(_j)
    
    drop id _j
    bysort date (spx): generate id = _n
    
    sort id (date)
    
    list, sepby(id)
    
         +------------------------------+
         |     date   spx   trans_   id |
         |------------------------------|
      1. | 1/1/2000   100        1    1 |
      2. | 1/2/2000   102      1.1    1 |
      3. | 1/3/2000   103     1.21    1 |
      4. | 1/4/2000   104    1.321    1 |
         |------------------------------|
      5. | 1/1/2000   100        1    2 |
      6. | 1/2/2000   102       .9    2 |
      7. | 1/3/2000   103      .81    2 |
      8. | 1/4/2000   104     .729    2 |
         +------------------------------+