stata panel-data

How to modify a variable conditioned on max value of other variable


I have a long-format dataset with four variables: an ID, a time-varying variable (sbp), time, and an outcome (y).

Subjects have differing numbers of rows because they are observed at different times and have different outcome values (0, 1 or 2). I need to keep the outcome value only at each subject's last time point and set the outcome to 0 in all other rows.

I can't figure out how to gen a new variable = outcome only for max(time) by ID

id  sbp y   time
1   120 1   0
1   126 1   1
1   126 1   2
1   126 1   3 
1   126 1   4
1   132 1   5
1   132 1   6
1   132 1   7
1   150 1   8
1   150 1   9
1   150 1   10
1   160 1   11
1   160 1   12
1   160 1   13
1   160 1   14

Solution

  • You seem to be asking two quite different things:

    1. Replacing outcome values before the last for each panel with 0.

    2. Keeping only the last.

    Here they are in turn:

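    * 1. sort by time within id, then zero out y in all but the last row
    bysort id (time) : replace y = 0 if _n < _N
    * 2. keep only the last row of each id (data are already sorted)
    by id: keep if _n == _N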
    

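    If you just want the second, you need bysort id (time) rather than by id. Plain by id requires the data to be sorted by id already, and without (time) there is no guarantee that the last observation in each panel is the one with the latest time.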
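
    The question also asks for a new variable rather than overwriting y. A minimal sketch of that variant follows, assuming the variable names from the question. The name ylast and the rows for id 2 are made up for illustration, and the rows for id 1 are a subset of the posted data.

```stata
clear
input id sbp y time
1 120 1 0
1 126 1 1
1 160 1 14
2 118 0 0
2 130 2 1
end

* ylast (hypothetical name) copies y in the last row of each id
* and is 0 in every earlier row; y itself is left unchanged
bysort id (time) : generate ylast = cond(_n == _N, y, 0)

list, sepby(id)
```

    Because (time) sorts within id before the by-group runs, _n == _N picks out the row with the largest time for each subject, assuming time has no missing values and no ties within id.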