I have a dataset that looks like the following, where each ID has 3 levels, and where one of those levels has a value (and all other levels within that ID are 0):
ID level value
1 1 0
1 2 0
1 3 1
2 1 0
2 2 1
2 3 0
I need to return a similar dataframe, with an additional column which specifies which row within the ID has the value 1. In this case:
ID level value which
1 1 0 3
1 2 0 0
1 3 1 0
2 1 0 2
2 2 1 0
2 3 0 0
I feel like I should be able to create this somehow by group_by(ID) and then a mutate based on a case_when that refers to the rows relative to the group (i.e. if it is the 1st, 2nd, or 3rd row), but I can't crack how that should work.
Any suggestions are much appreciated!
You can use which
or better which.max
which is guaranteed to return only 1 value.
library(dplyr)
df %>%
group_by(ID) %>%
mutate(which = which.max(value) * +(row_number() == 1))
# ID level value which
# <int> <int> <int> <int>
#1 1 1 0 3
#2 1 2 0 0
#3 1 3 1 0
#4 2 1 0 2
#5 2 2 1 0
#6 2 3 0 0
+(row_number() == 1)
is to ensure that the value of which
is assigned to only 1st row in the group and rest all the rows are 0.