How to get non-consecutive entry and exit dates for datasets?

I have a unbalanced panel dataset from 2008 to 2018 of individuals in a labour force. Therefore, if individuals are working for a specific year, they will earn an income. It looks like this:

* Example generated by -dataex-. For more info, type help dataex
clear
input float(idno year income)
1 2008 100
1 2009 100
1 2010 100
1 2011 100
1 2012 100
1 2013 100
1 2014 100
1 2015 100
1 2016 100
1 2017 100
1 2018 100
2 2008 100
2 2009 100
2 2010 100
3 2009 100
3 2010 100
3 2015 100
3 2016 100
end

From this sample, we know that individual 1 (idno== 1) earns an income from 2008 to 2018; similarly, individual 2 (idno== 2) works during 2008-2010.

I would like to determine the year that an individual enters the labour force and leaves the labour force. Therefore, I tried the following:

I rectangularised the dataset (I use Stata 16):

fillin idno year

Then I determine whether individuals have worked in the dataset:

gen work = . 
replace work = 1 if income != .

Then I try to determine the start and end dates by individual (this would only work nicely for consecutive work periods)

bysort idno: gen years_earn_income = year if work == 1 
bysort idno: gen years_no_income = year if work == 0
bysort idno: gen start = min(years_earn_income)
bysort idno: gen end = max(years_earn_income)

I am struggling to get an appropriate entry and exit year for individuals that have multiple periods of employment. For example, individual 3 (idno== 3) works for 2009-2010 and 2015-2016. Therefore, I would want variables to reflect multiple employment periods as is the case for individual 3. I would appreciate any thoughts of how to think about this.

Solution

See https://www.stata-journal.com/article.html?article=dm0029 for a discussion of handling spells and tsspell from SSC for an implementation. Your example could be analysed like this:

 clear 
 input idno year income 
 1 2008 100
 1 2009 100
 1 2010 100
 1 2011 100
 1 2012 100
 1 2013 100
 1 2014 100
 1 2015 100
 1 2016 100
 1 2017 100
 1 2018 100
 2 2008 100
 2 2009 100
 2 2010 100
 3 2009 100
 3 2010 100
 3 2015 100
 3 2016 100
 end 
 
 tsset idno year 
 tsfill 
 
 ssc install tsspell 
 
 tsspell, pcond(income)
 
 list, sepby(idno _spell)
 
 list if _seq == 1 | _end, sepby(idno _spell)

Here are the results:

.  list, sepby(idno _spell)

     +---------------------------------------------+
     | idno   year   income   _seq   _spell   _end |
     |---------------------------------------------|
  1. |    1   2008      100      1        1      0 |
  2. |    1   2009      100      2        1      0 |
  3. |    1   2010      100      3        1      0 |
  4. |    1   2011      100      4        1      0 |
  5. |    1   2012      100      5        1      0 |
  6. |    1   2013      100      6        1      0 |
  7. |    1   2014      100      7        1      0 |
  8. |    1   2015      100      8        1      0 |
  9. |    1   2016      100      9        1      0 |
 10. |    1   2017      100     10        1      0 |
 11. |    1   2018      100     11        1      1 |
     |---------------------------------------------|
 12. |    2   2008      100      1        1      0 |
 13. |    2   2009      100      2        1      0 |
 14. |    2   2010      100      3        1      1 |
     |---------------------------------------------|
 15. |    3   2009      100      1        1      0 |
 16. |    3   2010      100      2        1      1 |
     |---------------------------------------------|
 17. |    3   2011        .      0        0      0 |
 18. |    3   2012        .      0        0      0 |
 19. |    3   2013        .      0        0      0 |
 20. |    3   2014        .      0        0      0 |
     |---------------------------------------------|
 21. |    3   2015      100      1        2      0 |
 22. |    3   2016      100      2        2      1 |
     +---------------------------------------------+


.  list if _seq == 1 | _end, sepby(idno _spell)

     +---------------------------------------------+
     | idno   year   income   _seq   _spell   _end |
     |---------------------------------------------|
  1. |    1   2008      100      1        1      0 |
 11. |    1   2018      100     11        1      1 |
     |---------------------------------------------|
 12. |    2   2008      100      1        1      0 |
 14. |    2   2010      100      3        1      1 |
     |---------------------------------------------|
 15. |    3   2009      100      1        1      0 |
 16. |    3   2010      100      2        1      1 |
     |---------------------------------------------|
 21. |    3   2015      100      1        2      0 |
 22. |    3   2016      100      2        2      1 |
     +---------------------------------------------+