Search code examples
statapanel-data

Restructuring variables in Stata


I'm a new user of Stata and I'm trying to understand how it executes commands. I'm facing trouble in restructuring data from its present format to a panel data format.

I'm using firm level micro-data which, for example, contain firm id, last avail year (latest year for which data was collected from that firm) and turnover (REV_LAY-0 = turnover from last avail year - 0, REV_LAY-1 = turnover from last avail year - 1 and so on).

The present data format is the following:

Present data format

The required panel format looks like this:

Required panel data format

In SAS, I would do the following in a loop:

if last_avail_yr=2016 then do;
rev_2016=rev_lay-0;
rev_2015=rev_lay-1;
rev_2014=rev_lay-2;
rev_2013=rev_lay-3;
end;

But I'm not quite sure how to do it Stata. I tried using an if statement with a forvalues loop to achieve a similar result, but it didn't work out well.

Example data can be found below:

MARK BvD_ID       LAST_AVAIL_YR REV_LAY0 REV_LAY1 REV_LAY2 REV_LAY3 REV_LAY4
437  ESA22001721  2016          27689    32097    28992    35868    36493
438  ESF23212103  2015          26786    52095    33023    29493    40368
439  ESB45426806  2012          22072    14864    12877    15330    6403
440  ESA45039294  2015          26700    23387    21104    21272    20002
441  ESB76638790  2016          27480    24303    10699    .        .

Can anyone help me with the Stata code for this problem?


Solution

  • rev_lay-0 and so on are not valid names in Stata, so I assume they would be named rev_lay_0 and so on. Given that, the following should do the trick:a

    reshape long rev_lay_, i(firm_id last_avail_yr) j(id)
    by firm_id last_avail_yr: gen yr = last_avail_yr - _n + 1
    keep firm_id last_avail_yr rev_lay_ yr
    reshape wide rev_lay_, i(firm_id last_avail_yr) j(yr)