Take the following example of patient data from a hospital:
YEAR <- sample(1980:1995, 15, replace = TRUE)
Pat_ID <- sample(1:100, 15)
sex <- c(1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0)
df1 <- data.frame(Pat_ID, YEAR, sex)
I want to introduce a dummy variable, df1$PAIR_IDENTIFIER, that takes a new value each time a new sex == 1 appears. The problem is that there is no constant pattern in the sex variable. Sometimes the next 1 appears at position i+2, sometimes at position i+3, and so on.
So the result should be df1$PAIR_IDENTIFIER <- c(1,1,2,2,3,3,3,4,4,4,4,4 .....)
You can do this simply by using cumsum():
df1$PAIR_IDENTIFIER <- cumsum(df1$sex)
df1
#   Pat_ID YEAR sex PAIR_IDENTIFIER
#1      54 1991   1               1
#2     100 1992   0               1
#3       6 1995   1               2
#4      99 1994   0               2
#5      42 1988   1               3
#6      65 1990   0               3
#7      53 1994   0               3
#8      96 1987   1               4
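
To see why this works, here is a minimal sketch that uses only the sex vector from the question. Because sex is coded 0/1, the running total goes up by one at each 1 and stays flat on each 0, so every run of rows that starts with a 1 shares one identifier. The names pair_identifier and pair_identifier_alt are just illustrative, and the second form assumes you might want to key on the condition sex == 1 rather than on the raw values.

```r
# sex vector copied from the question
sex <- c(1, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0)

# Running total: steps up by 1 at every sex == 1, unchanged at every 0
pair_identifier <- cumsum(sex)
print(pair_identifier)
# 1 1 2 2 3 3 3 4 4 4 4 4 5 5 5

# Equivalent form that counts the logical condition instead of the raw
# values; useful if the marker column is not strictly coded as 0/1
pair_identifier_alt <- cumsum(sex == 1)
```

Note that if the data started with a 0 row, those leading rows would get identifier 0, since no 1 has been seen yet.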