stata, panel-data

Inactivity duration variable in panel data (Stata)


I have a dataset on U.S. manufacturing workers covering the past three decades, and I am particularly interested in the following variables:

  1. Month and year of starting the 1st manufacturing job, recorded separately and named "start_month_job_1" & "start_yr_job_1".
  2. Month and year of leaving the 1st manufacturing job, recorded separately and named "end_month_job_1" & "end_yr_job_1".
  3. The reason for leaving the job (e.g. retirement, firing, factory shutdown, etc.), named "leaving_reason".
  4. Month and year of starting the 2nd manufacturing job, recorded separately and named "start_month_job_2" & "start_yr_job_2".
  5. Month and year of leaving the 2nd manufacturing job, recorded separately and named "end_month_job_2" & "end_yr_job_2".

I am trying to create a variable that measures the duration of economic inactivity/idleness. I define "duration of economic inactivity" as the time between leaving the 1st job and starting the next one. I have created a variable that does this in whole years, as below:

gen econ_inactivity_duration_1 = start_yr_job_2 - end_yr_job_1
replace econ_inactivity_duration_1 = 2018 - end_yr_job_1 if missing(start_yr_job_2) // worker has not started a second job by 2018, the latest year measured in the survey

However, I want an economic inactivity duration variable that takes both month and year into account when comparing the date of leaving one job with the date of starting the next. For instance, the duration for the worker in row 1 would be 2 months, between May 1993 and July 1993, as opposed to zero, which is what my code above computes.

dataex start_month_job_1 start_yr_job_1 end_month_job_1 end_yr_job_1 start_month_job_2 start_yr_job_2 end_month_job_2 end_yr_job_2 leaving_reason

 3 1990  5 1993  7 1993  4 1994 "Firm shutdown"
 1 2003  7 2015  .    .  .    . "job automation"
98 1979 98 2004  .    .  .    . "Firm shutdown"
98 1975 98 2010 98 2010 98 2015 "job automation"
 1 1983 12 1985  1 1986  .    . "Firm shutdown"
98 1996 98 1998  .    .  .    . "Firm shutdown"

Solution

  • There is probably a better way, but here is a crude method.

    * Data example
    input end_month_job_1 end_yr_job_1 start_month_job_2 start_yr_job_2
    5 1993 7 1993
    end
    
    * Calculate months since 1960
    gen j1_end = (end_yr_job_1 - 1960) * 12 + end_month_job_1
    gen j2_start = (start_yr_job_2 - 1960) * 12 + start_month_job_2
    
    * Calculate difference
    gen wanted = j2_start - j1_end
    
    * Check difference is positive
    assert wanted > 0
    
    list
    
         +------------------------------------------------------------------------+
         | end_mo~1   end_yr~1   s~mont~2   s~yr_j~2   j1_end   j2_start   wanted |
         |------------------------------------------------------------------------|
      1. |        5       1993          7       1993      401        403        2 |
         +------------------------------------------------------------------------+
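
The same idea can be written with Stata's built-in ym() function, which returns a monthly date counted from 1960m1 = 0. Its values are one lower than the hand-computed ones above (400 and 402 rather than 401 and 403), but the differences are identical. The sketch below also extends the calculation to workers with no second job. It rests on two assumptions that the question does not confirm: that month code 98 means "month unknown", and that the survey window closes in December 2018. The variable name inactivity_months is made up for illustration.

```stata
* Sketch only. Assumptions (not confirmed in the question):
*   - month code 98 means "month unknown"
*   - the survey window ends in December 2018
clear
input end_month_job_1 end_yr_job_1 start_month_job_2 start_yr_job_2
 5 1993  7 1993
 7 2015  .    .
98 2004  .    .
98 2010 98 2010
12 1985  1 1986
98 1998  .    .
end

* Recode the assumed "unknown" code to missing so that no date is
* built from month 98
foreach v of varlist end_month_job_1 start_month_job_2 {
    replace `v' = . if `v' == 98
}

* ym() returns a Stata monthly date: months elapsed since 1960m1
gen j1_end   = ym(end_yr_job_1, end_month_job_1)
gen j2_start = ym(start_yr_job_2, start_month_job_2)
format j1_end j2_start %tm

* Gap in months between leaving job 1 and starting job 2
gen inactivity_months = j2_start - j1_end

* No second job recorded: measure up to the assumed end of the survey
replace inactivity_months = ym(2018, 12) - j1_end if missing(start_yr_job_2)

list end_month_job_1 end_yr_job_1 start_month_job_2 start_yr_job_2 inactivity_months
```

With these rows, the first worker gets 2 months and the fifth gets 1 month. Workers whose month was recoded to missing get a missing duration rather than a misleading number; whether to fall back to the year-only difference for them is a judgment call that depends on what 98 really means in the survey.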