Search code examples
statapanel-data

Reshape long panel data to wide where data is not unique within ID


I have a dataset that looks like this:

| State | Year | Industry | Employment |
|-------|------|----------|------------|
| AL    | 2014 | 1        | 123345     |
| AL    | 2015 | 1        | 145411     |
| AL    | 2016 | 1        | 149402     |
| AL    | 2014 | 2        | 153518     |
| AL    | 2015 | 2        | 157773     |
| AL    | 2016 | 2        | 163156     |
| AK    | 2014 | 1        | 167187     |
| AK    | 2015 | 1        | 167863     |
| AK    | 2016 | 1        | 163320     |
| AK    | 2014 | 2        | 162419     |
| AK    | 2015 | 2        | 166116     |
| AK    | 2016 | 2        | 170136     |

I would like to end up with a dataset that looks as follows:

| State | Year | Employment_Industry1 | Employment_Industry2 |
|-------|------|----------------------|----------------------|
| AL    | 2014 | 123345               | 153518               |
| AL    | 2015 | 145411               | 157773               |
| AL    | 2016 | 149402               | 163156               |
| AK    | 2014 | 167187               | 162419               |
| AK    | 2015 | 167863               | 166116               |
| AK    | 2016 | 163320               | 170136               |

As you can see, the data I have is in long format but the years are repeated within a State by Industry. This is causing an issue when I reshape wide.

I generated IDs for a couple of different variable groupings, but I end up with an error to the effect of:

values of variable Industry not unique within ID

What kind of an ID do I need to create, or is there something I can do to create the desired dataset?


Solution

  • The following works for me:

    clear
    input str2 State Year Industry Employment
    AL 2014 1 123345    
    AL 2015 1 145411     
    AL 2016 1 149402     
    AL 2014 2 153518     
    AL 2015 2 157773     
    AL 2016 2 163156     
    AK 2014 1 167187     
    AK 2015 1 167863     
    AK 2016 1 163320     
    AK 2014 2 162419     
    AK 2015 2 166116     
    AK 2016 2 170136   
    end  
    
    egen id = group(State)
    reshape wide Employment, i(id Year) j(Industry)
    
    drop id
    order State Year Employment*
    
    list, abbreviate(15) sepby(State)
    
         +------------------------------------------+
         | State   Year   Employment1   Employment2 |
         |------------------------------------------|
      1. |    AK   2014        167187        162419 |
      2. |    AK   2015        167863        166116 |
      3. |    AK   2016        163320        170136 |
         |------------------------------------------|
      4. |    AL   2014        123345        153518 |
      5. |    AL   2015        145411        157773 |
      6. |    AL   2016        149402        163156 |
         +------------------------------------------+