Assume we have some random data:
data <- data.frame(ID = rep(1:3, 3),
                   Var = sample(1:9, 9))
We can compute summaries using dplyr, like this:
library(dplyr)
data%>%
group_by(ID)%>%
summarize(count = n_distinct(Var))
which gives the following output below an R Markdown chunk:
ID count
1 3
2 3
3 3
I would like to know how we can perform operations on individual data points in this dplyr
output without saving the output in a separate object.
For example, in the output of summarise, let's say we wanted to subtract the output value for ID == 3 from the sum of the output values for ID == 1 and ID == 2, and leave the output values for ID == 1 and ID == 2 as they are. The only way I know to do this is to save the summary output in another object and perform the operation on that object, like this:
a<-
data%>%
group_by(ID)%>%
summarize(count = n_distinct(Var))
a
#now perform the operation on a
a[3,2] <- a[2,1]+a[2,2]-1
a
a now looks like this:
ID count
1 3
2 3
3 4
Is there a way to do this in dplyr output without making new objects? Can we somehow use mutate directly on output like this?
We can add a mutate after the summarise, using replace to modify the value at a given position (here n(), the last row):
library(dplyr)
data%>%
group_by(ID)%>%
summarize(count = n_distinct(Var)) %>%
mutate(count = replace(count, n(), count[2] + ID[2] - 1))
Output:
# A tibble: 3 x 2
ID count
<int> <dbl>
1 1 3
2 2 3
3 3 4
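The replace call above targets the last row by position. If the rows might not be in ID order, a variant that selects the row by condition may be safer. The following is only a sketch: it passes a logical index (ID == 3) to replace, and the names res and res_prose are hypothetical, assigned here only so the results can be inspected. The second pipeline assumes the reading given in the question's prose (sum of the counts for IDs 1 and 2, minus the count for ID 3) instead of the reading in the question's code.

```r
library(dplyr)

data <- data.frame(ID = rep(1:3, 3),
                   Var = sample(1:9, 9))

# Same computation as the positional version, but the target row is
# chosen with a logical index (ID == 3) rather than n().
res <- data %>%
  group_by(ID) %>%
  summarize(count = n_distinct(Var)) %>%
  mutate(count = replace(count, ID == 3, count[ID == 2] + ID[ID == 2] - 1))
res$count  # 3 3 4

# Assumed alternative reading: counts for ID 1 and 2 summed, minus the
# count for ID 3. All other rows are left unchanged.
res_prose <- data %>%
  group_by(ID) %>%
  summarize(count = n_distinct(Var)) %>%
  mutate(count = replace(count, ID == 3,
                         sum(count[ID %in% 1:2]) - count[ID == 3]))
res_prose$count  # 3 3 3
```

Because Var is a permutation of 1:9, every ID always has three distinct values, so both results are the same on every run despite the random sampling.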
Or, if there are more than two columns, use sum on the sliced row:
data%>%
group_by(ID)%>%
summarize(count = n_distinct(Var)) %>%
mutate(count = replace(count, n(), sum(cur_data() %>%
slice(2)) - 1))
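In dplyr 1.1.0 and later, cur_data() is deprecated in favour of pick(). Assuming such a version is installed, the same idea could be sketched as below. The name res is hypothetical and is only there so the result can be checked.

```r
library(dplyr)  # assumes dplyr >= 1.1.0, where pick() is available

data <- data.frame(ID = rep(1:3, 3),
                   Var = sample(1:9, 9))

res <- data %>%
  group_by(ID) %>%
  summarize(count = n_distinct(Var)) %>%
  # pick(everything()) returns the current columns (ID and count) as a
  # tibble; slice(2) keeps its second row, and sum() adds up that row.
  mutate(count = replace(count, n(),
                         sum(pick(everything()) %>% slice(2)) - 1))
res$count  # 3 3 4
```

Note that sum() over the sliced row only works if every selected column is numeric. With non-numeric columns present, a narrower selection inside pick() would be needed.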