Search code examples
exceldatabase-designstatisticsstata

How can I create a time variable with Stata or Excel?


I have a dataset that I am editing so it can be used for a time series regression since the time is not currently in a format that is usable. The format of the existing data is as follows:

     --------------------------------------------------
    | id|size |2017price|2016price|2015price|2014price| ...
      -------------------------------------------------
    | 1 | 3   | 50      | 80      |  21     |  56     | ...
     --------------------------------------------------
    | 2 | 5   | 78      | 85      |  54     |  67     | ... 
     --------------------------------------------------
    | 3 | 2   | 18      | 22      |  34     |  54     | ... 
     --------------------------------------------------
    ...
    ...
    ...

I would like to add a time variable that accounts for each year and gives the corresponding value as a price variable;

    ---------------------------
    | id  | size  |t   | price| 
     --------------------------
    | 1   | 3     |2017| 50   | 
     --------------------------
    | 1   | 3     |2016| 80   |  
     --------------------------
    | 1   | 3     |2015| 21   | 
     --------------------------
    | 1   | 3     |2014| 21   | 
     --------------------------
    | 2   | 5     |2017| 78   | 
     --------------------------
    | 2   | 5     |2016| 85   |  
     --------------------------
    | 2   | 5     |2015| 54   | 
     --------------------------
    | 2   | 5     |2014| 67   | 
     --------------------------
    | 3   | 2     |2017| 18   | 
     --------------------------
    | 3   | 2     |2016| 22   |  
     --------------------------
    | 3   | 2     |2015| 34   | 
     --------------------------
    | 3   | 2     |2014| 54   | 
     --------------------------
    ...
    ...
    ...

Is there a function in Stata or Excel that can do this automatically? I have data for 20 years with over 35,000 entries so manually editing won't work.


Solution

  • Your data example as given is not quite suitable as Stata data as variable names cannot begin with numeric characters.

    That fixed, this is an exercise for the reshape command (not function).

    clear
    input id size price2017 price2016 price2015 price2014 
     1 3 50 80 21 56 
     2 5 78 85 54 67 
     3 2 18 22 34 54 
    end 
    
    reshape long price, i(id size) j(year) 
    sort id size year 
    
    list , sepby(id)
    
         +--------------------------+
         | id   size   year   price |
         |--------------------------|
      1. |  1      3   2014      56 |
      2. |  1      3   2015      21 |
      3. |  1      3   2016      80 |
      4. |  1      3   2017      50 |
         |--------------------------|
      5. |  2      5   2014      67 |
      6. |  2      5   2015      54 |
      7. |  2      5   2016      85 |
      8. |  2      5   2017      78 |
         |--------------------------|
      9. |  3      2   2014      54 |
     10. |  3      2   2015      34 |
     11. |  3      2   2016      22 |
     12. |  3      2   2017      18 |
         +--------------------------+