Search code examples
panelstata

How to transform wide type to long type in Stata when the data set is big


I have the following panel data set:

ID SEX    age_t1 age_t2 C1_Q1 C1_Q2 ...  C2_Q1 C2_Q2 ... C3_Q1 .... C4_Q1 
1  female   17    20      1    2           3     5        2           4 

C1 is year 2014 and C4 is 2017.
The problem is that the number of questions are different by year. I speculate that using stack by designating the number of columns might work. For example, from column 5 to 10 becomes the first group, and column 6 to 12 becomes the second group. But I don't know how to do it.


Solution

  • If the number of questions differs by year, then the logic of your suggestion is hard to follow.

    The Stata tag wiki gives advice on how to present data examples, which typically need to be simplified but should be as realistic as possible, rather than schematic.

    This may help:

    clear 
    input ID SEX    age_t1 age_t2 C1_Q1 C1_Q2 C2_Q1 C2_Q2 C3_Q1 C3_Q2 
    1  1   17    20      1    2     3     5   2     4 
    end 
    
    rename (C*_Q*) (Q*_C*)
    unab stubs : *_C1
    local stubs : subinstr local stubs "_C1" "", all
    reshape long `stubs', i(ID SEX age*) j(year) string
    destring year, ignore(_C) replace
    replace year = year + 2013
    
    list 
    
         +--------------------------------------------------+
         | ID   SEX   age_t1   age_t2   year   Q1   Q2   Q3 |
         |--------------------------------------------------|
      1. |  1     1       17       20   2014    1    3    2 |
      2. |  1     1       17       20   2015    2    5    4 |
         +--------------------------------------------------+
    

    How to get the stubs is documented in this FAQ.