I'm looking to perform operations for one column based on grouping for another column.
Say I have the following data:
user <- c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3)
score <- c(1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 1)
time_1 <- c(130, NA, 120, 245, NA, NA, NA, 841, NA, NA, 721, 612)
time_2 <- c(NA, 742, NA, NA, 812, 212, 214, NA, 919, 528, NA, NA)
df <- data.frame(user, score, time_1, time_2)
We get the following df:
user score time_1 time_2
1 1 130 NA
1 0 NA 742
1 1 120 NA
1 1 245 NA
2 0 NA 812
2 0 NA 212
2 0 NA 214
2 1 841 NA
3 0 NA 919
3 0 NA 528
3 1 721 NA
3 1 612 NA
For each user (user 1, for example), what is the smallest value of time_1?
So I am looking to group users by their number and perform an operation on column time_1.
Update at the OP's request (see comments):
Just replace summarise with mutate. This keeps every row and adds the per-user minimum as a new column:
library(dplyr)
df %>%
  group_by(user) %>%
  mutate(Smallest_time1 = min(time_1, na.rm = TRUE))
user score time_1 time_2 Smallest_time1
<dbl> <dbl> <dbl> <dbl> <dbl>
1 1 1 130 NA 120
2 1 0 NA 742 120
3 1 1 120 NA 120
4 1 1 245 NA 120
5 2 0 NA 812 841
6 2 0 NA 212 841
7 2 0 NA 214 841
8 2 1 841 NA 841
9 3 0 NA 919 612
10 3 0 NA 528 612
11 3 1 721 NA 612
12 3 1 612 NA 612
We could use min() inside summarise with the na.rm = TRUE argument:
library(dplyr)
df %>%
  group_by(user) %>%
  summarise(Smallest_time1 = min(time_1, na.rm = TRUE))
user Smallest_time1
<dbl> <dbl>
1 1 120
2 2 841
3 3 612
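One caveat with na.rm = TRUE: if a user had no non-missing time_1 values at all, min() would return Inf with a warning. No user in the example data is in that situation, so the following is only a sketch for that hypothetical case. The helper name safe_min and the extra user 4 are made up for illustration and are not part of dplyr or the original data.

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): returns NA instead of Inf
# when every value in the group is missing.
safe_min <- function(x) {
  if (all(is.na(x))) NA_real_ else min(x, na.rm = TRUE)
}

# time_1 from the question, plus a made-up user 4 with no time_1 values.
df2 <- data.frame(
  user   = c(1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3, 3, 4),
  time_1 = c(130, NA, 120, 245, NA, NA, NA, 841, NA, NA, 721, 612, NA)
)

res <- df2 %>%
  group_by(user) %>%
  summarise(Smallest_time1 = safe_min(time_1))
```

Users 1 to 3 get the same minimums as above, and the made-up user 4 gets NA rather than Inf.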