I have the following data:
structure(list(id = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L),
form = c("test", "test", "missing", "test", "test", "missing",
"test", "test", "test", "missing")), row.names = c(NA, 10L), class = "data.frame")
id | form |
---|---|
1 | test |
1 | test |
1 | missing |
1 | test |
2 | test |
2 | missing |
2 | test |
3 | test |
3 | test |
3 | missing |
I need to add a column ("form_completed") that counts rows starting from 1 within each "id", but skips rows where "form" (a string variable) is "missing" and leaves NA there. The output should be:
structure(list(id = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L),
form = c("test", "test", "missing", "test", "test", "missing",
"test", "test", "test", "missing"), form_completed = c(1L,
2L, NA, 3L, 1L, NA, 2L, 1L, 2L, NA)), row.names = c(NA, 10L), class = "data.frame")
id | form | form_completed |
---|---|---|
1 | test | 1 |
1 | test | 2 |
1 | missing | NA |
1 | test | 3 |
2 | test | 1 |
2 | missing | NA |
2 | test | 2 |
3 | test | 1 |
3 | test | 2 |
3 | missing | NA |
It seems pretty straightforward, and I have tried different things with mutate() and row_count() in R, but can't get it to work. Any help would be much appreciated!
dplyr
You can use case_when + cumsum:
library(dplyr)
df %>%
  group_by(id) %>%
  mutate(a = case_when(form == 'test' ~ cumsum(form == "test")))
id form a
<int> <chr> <int>
1 1 test 1
2 1 test 2
3 1 missing NA
4 1 test 3
5 2 test 1
6 2 missing NA
7 2 test 2
8 3 test 1
9 3 test 2
10 3 missing NA
Or with ifelse:
df %>%
  group_by(id) %>%
  mutate(a = ifelse(form == 'test', cumsum(form == "test"), NA))
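To see why both versions work, it may help to look at a single group outside of dplyr. The sketch below uses only base R and the four "form" values for id 1 from the question; the variable names (is_test, running, form_completed) are just illustrative.

```r
# Values of "form" for id == 1, copied from the question
form <- c("test", "test", "missing", "test")

# Logical flag: which rows should be counted
is_test <- form == "test"

# cumsum() treats TRUE as 1 and FALSE as 0, so the running total
# simply does not advance on the skipped row
running <- cumsum(is_test)

# Blank out the skipped rows; NA_integer_ keeps the result integer
form_completed <- ifelse(is_test, running, NA_integer_)

print(form_completed)
```

Inside group_by(id), the same computation is repeated per id, which is what restarts the counter at 1 for each group.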
base R
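ave() returns a vector of the same type as its first argument, so pass the integer id column (rather than the character form column) to get an integer counter:
df$a[df$form == "test"] <- with(df[df$form == "test", ], ave(id, id, FUN = seq_along))
id form a
1 1 test 1
2 1 test 2
3 1 missing NA
4 1 test 3
5 2 test 1
6 2 missing NA
7 2 test 2
8 3 test 1
9 3 test 2
10 3 missing NA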
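If you want the column named form_completed as in the question, a base R variant that fills the whole column in one pass could look like the following. This is only a sketch: it rebuilds the question's data, swaps seq_along for a grouped cumsum via ave(), and the helper names (is_test, counts, expected) are my own.

```r
# Data from the question
df <- data.frame(
  id   = c(1L, 1L, 1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L),
  form = c("test", "test", "missing", "test", "test", "missing",
           "test", "test", "test", "missing")
)

# Rows that should receive a counter value
is_test <- df$form == "test"

# Running count of "test" rows within each id
counts <- ave(as.integer(is_test), df$id, FUN = cumsum)

# Keep the count on "test" rows, NA everywhere else
df$form_completed <- ifelse(is_test, counts, NA_integer_)

# Compare against the expected column given in the question
expected <- c(1L, 2L, NA, 3L, 1L, NA, 2L, 1L, 2L, NA)
print(identical(df$form_completed, expected))
```

Unlike the subset assignment above, this does not rely on the column being created by a partial assignment, which some people find easier to read.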