I am handling a panel data like df. So I have a unique UserID (Uid), a TimeVariable (TV) and my main variable of interest(MV, dummy coded). Each row represents a unique UiD TV combination. Now I want to create a new variable (NV), which counts basically the number of events (MV=1) that happend up to time TV.
The contect: I want to have a count variable (NV) that basically counts for each moment in time how often Event MV=1 happend up to that moment (TV).
So far I can only manage to sum it up per UserId but not having it on a increasing level.
df <- df %>% group_by(user_id) %>% mutate(NV=count_if(1,MV))
The result is given to my on a aggreate basis per user ID for the MV
So the df looks like:
UI TV MV
1 1 0
1 2 1
1 3 0
2 1 0
2 2 0
2 3 1
2 4 2
3 1 1
3 2 0
3 3 1
3 4 1
The resuls with my code so far is:
UI TV MV NV
1 1 0 1
1 2 1 1
1 3 0 1
2 1 0 2
2 2 0 2
2 3 1 2
2 4 2 2
3 1 1 3
3 2 0 3
3 3 1 3
3 4 1 3
What I actually want:
UI TV MV NV
1 1 0 0
1 2 1 1
1 3 0 1
2 1 0 0
2 2 0 0
2 3 1 1
2 4 2 2
3 1 1 1
3 2 0 1
3 3 1 2
3 4 1 3
Thanks very much in advance for you help!
We could group_by
UI
and take cumulative sum of occurrence of MV == 1
library(dplyr)
df %>%
group_by(UI) %>%
mutate(NV = cumsum(MV == 1))
# UI TV MV NV
# <int> <int> <int> <int>
# 1 1 1 0 0
# 2 1 2 1 1
# 3 1 3 0 1
# 4 2 1 0 0
# 5 2 2 0 0
# 6 2 3 1 1
# 7 2 4 2 1
# 8 3 1 1 1
# 9 3 2 0 1
#10 3 3 1 2
#11 3 4 1 3
In base R, we can use ave
with(df, ave(MV == 1, UI, FUN = cumsum))
#[1] 0 1 1 0 0 1 1 1 1 2 3
data
df <- structure(list(UI = c(1L, 1L, 1L, 2L, 2L, 2L, 2L, 3L, 3L, 3L,
3L), TV = c(1L, 2L, 3L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L), MV = c(0L,
1L, 0L, 0L, 0L, 1L, 2L, 1L, 0L, 1L, 1L)), class = "data.frame",
row.names = c(NA,
-11L))