I would like to flag itemid
when the score
variable is missing any score between 0
and max.score
of that item. Here is an example dataset.
df <- data.frame(id = c(1,2,3, 1,2,3, 1,2,3, 1,2,3, 1,2,3),
itemid = c(11,11,11, 12,12,12, 13,13,13, 14,14,14, 15,15,15),
score = c(0,1,2, 1,2,2, 0,1,1, 2,0,0, 1,1,1),
max.points = c(2,2,2, 2,2,2, 1,1,1, 2,2,2, 1,1,1 ))
> df
id itemid score max.points
1 1 11 0 2
2 2 11 1 2
3 3 11 2 2
4 1 12 1 2
5 2 12 2 2
6 3 12 2 2
7 1 13 0 1
8 2 13 1 1
9 3 13 1 1
10 1 14 2 2
11 2 14 0 2
12 3 14 0 2
13 1 15 1 1
14 2 15 1 1
15 3 15 1 1
In this case, itemid=12
is missing score
of 0
, itemid=14
is missing score
of 1
, and itemid=15
is missing score
of 0
.
Any ideas on how to flag items 12,14 and 15?
Thanks!
Grouped by 'itemid', create the flag
by checking if all
of the sequence of max.points
(from 0) are %in%
'score' and negate (!
)
library(dplyr)
df %>%
group_by(itemid) %>%
mutate(flag = !all(c(0, seq_len(first(max.points))) %in% score)) %>%
ungroup
-output
# A tibble: 15 × 5
id itemid score max.points flag
<dbl> <dbl> <dbl> <dbl> <lgl>
1 1 11 0 2 FALSE
2 2 11 1 2 FALSE
3 3 11 2 2 FALSE
4 1 12 1 2 TRUE
5 2 12 2 2 TRUE
6 3 12 2 2 TRUE
7 1 13 0 1 FALSE
8 2 13 1 1 FALSE
9 3 13 1 1 FALSE
10 1 14 2 2 TRUE
11 2 14 0 2 TRUE
12 3 14 0 2 TRUE
13 1 15 1 1 TRUE
14 2 15 1 1 TRUE
15 3 15 1 1 TRUE