stata data-cleaning recode longitudinal

Keep first record when event occurs


I have the following data in Stata:

clear

* Input data
input grade id exit time
1   1   .   10
2   1   .   20
3   1   2   30
4   1   0   40
5   1   .   50
1   2   0   10
2   2   0   20
3   2   0   30
4   2   0   40
5   2   0   50
1   3   1   10
2   3   1   20
3   3   0   30
4   3   .   40
5   3   .   50
1   4   .   10
2   4   .   20
3   4   .   30
4   4   .   40
5   4   .   50
1   5   1   10
2   5   2   20
3   5   1   30
4   5   1   40
5   5   1   50

end

The objective is to keep the first row for each id in which an event occurs and, if no event occurs, to keep the last record for that id. Here is an example of the data I hope to obtain:

* Input data
input grade id  exit    time
3   1   2   30
5   2   0   50
1   3   1   10
5   4   .   50
1   5   1   10
end

Solution

  • An event appears to be defined as exit being neither zero nor missing, which is what exit > 0 & exit < . tests (Stata treats missing as larger than any number). If so, then all you need to do is tweak the code in my previous answer:

    bysort id (time): egen when_first_e = min(cond(exit > 0 & exit < ., time, .))
    by id: gen tokeep = cond(when_first_e == ., time == time[_N], time == when_first_e) 
    

    Previous thread was here.
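
    To actually reduce the data, the indicator then has to be applied with keep. Below is a minimal sketch of the whole sequence, assuming that time is unique within each id (with ties on time at the first event, more than one row per id would survive). To stay short it uses only ids 1, 2 and 4 from the question, which cover the three cases: an event, no event with zeros, and all missing. The list options are just one way to display the result.

    ```stata
    clear
    * Subset of the question's data: ids 1, 2 and 4 only
    input grade id exit time
    1   1   .   10
    2   1   .   20
    3   1   2   30
    4   1   0   40
    5   1   .   50
    1   2   0   10
    2   2   0   20
    3   2   0   30
    4   2   0   40
    5   2   0   50
    1   4   .   10
    2   4   .   20
    3   4   .   30
    4   4   .   40
    5   4   .   50
    end

    * Time of the first event per id; missing if the id never has an event
    bysort id (time): egen when_first_e = min(cond(exit > 0 & exit < ., time, .))

    * Flag the first event row, or the last row when there is no event
    by id: gen tokeep = cond(when_first_e == ., time == time[_N], time == when_first_e)

    * Keep the flagged rows and drop the helper variables
    keep if tokeep
    drop when_first_e tokeep

    list grade id exit time, noobs sepby(id)
    ```

    On this subset one row per id should remain: grade 3 for id 1 (the first event) and grade 5 for ids 2 and 4 (the last record), matching the corresponding rows of the desired output in the question.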