I have some R code that does what I want. My question: is there any mechanism to avoid coding A1, A2, A3 and so on? I would like to write something like A* for all columns beginning with A. The number of "A" columns varies, depending on the length of a list that is defined elsewhere in the code. The rest of the code is dynamic, but here I have to intervene manually (adding or deleting A columns within the summarise statement).
I have found summarize_at, but I don't see how I can do the other things like last() and sum() at the same time for the other columns.
l_af <- l_cf %>%
  group_by(PID, Server) %>%
  summarise(Player = last(Player),
            Guild = last(Guild),
            Points = last(Points),
            Battles = last(Battles),
            A1 = max(A1),
            A2 = max(A2),
            A3 = max(A3),
            A4 = max(A4),
            A5 = max(A5),
            A6 = max(A6),
            RecCount = sum(RecCount))
Any help is appreciated.
The problem with summarise is that it drops every column that is not used in the call. Instead, you can use mutate to perform all the operations first and then collapse each group with summarise.
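library(dplyr)

l_cf %>%
  group_by(PID, Server) %>%
  # after these mutate steps every column is constant within its group
  mutate_at(vars(Player, Guild, Points, Battles), last) %>%
  mutate_at(vars(starts_with("A")), max) %>%
  mutate(RecCount = sum(RecCount)) %>%
  # max() of a constant column just returns that constant, one row per group
  summarise_all(max)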
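With dplyr 1.0.0 or later, across() allows a single summarise() call to apply different functions to different sets of columns, which may be a more direct route than mutating first. The following is only a minimal sketch: the l_cf data frame below is a made-up stand-in that borrows the question's column names, and all of its values are invented for illustration.

```r
library(dplyr)

# Hypothetical toy data using the question's column names; values are invented.
l_cf <- data.frame(
  PID      = c(1, 1, 2, 2),
  Server   = c("s1", "s1", "s1", "s1"),
  Player   = c("ann", "anna", "bob", "bobby"),
  Guild    = c("g1", "g2", "g3", "g3"),
  Points   = c(10, 20, 5, 7),
  Battles  = c(1, 2, 3, 4),
  A1       = c(0, 3, 2, 1),
  A2       = c(5, 4, 0, 9),
  RecCount = c(1, 1, 1, 1)
)

l_af <- l_cf %>%
  group_by(PID, Server) %>%
  summarise(
    across(c(Player, Guild, Points, Battles), last),  # last value per group
    across(matches("^A[0-9]+$"), max),                # A followed by digits only
    RecCount = sum(RecCount),                         # total per group
    .groups = "drop"                                  # return an ungrouped tibble
  )

l_af
```

The regular expression in matches() is used here instead of starts_with("A") so that only names of the form A1, A2, ... are selected. Both helpers ignore case by default, so a column such as "age" would be picked up by starts_with("A"); pass ignore.case = FALSE if that matters for your data.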
A reproducible example
set.seed(123)
df <- data.frame(group = rep(1:5, 2), x = runif(10), y = runif(10),
                 a1 = runif(10), a2 = runif(10), z = runif(10))
First, apply the functions to each column individually:
df %>%
  group_by(group) %>%
  summarise(x = last(x),
            y = last(y),
            a1 = max(a1),
            a2 = max(a2),
            z = sum(z))
# A tibble: 5 x 6
# group x y a1 a2 z
# <int> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 1 0.0456 0.900 0.890 0.963 0.282
#2 2 0.528 0.246 0.693 0.902 0.648
#3 3 0.892 0.0421 0.641 0.691 0.880
#4 4 0.551 0.328 0.994 0.795 0.635
#5 5 0.457 0.955 0.656 0.232 1.01
Now apply each function to several columns at once:
df %>%
  group_by(group) %>%
  mutate_at(vars(x, y), last) %>%
  mutate_at(vars(starts_with("a")), max) %>%
  mutate(z = sum(z)) %>%
  summarise_all(max)
# group x y a1 a2 z
# <int> <dbl> <dbl> <dbl> <dbl> <dbl>
#1 1 0.0456 0.900 0.890 0.963 0.282
#2 2 0.528 0.246 0.693 0.902 0.648
#3 3 0.892 0.0421 0.641 0.691 0.880
#4 4 0.551 0.328 0.994 0.795 0.635
#5 5 0.457 0.955 0.656 0.232 1.01
We can see that both approaches give the same output.
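Rather than comparing the printed tables by eye, the two results can also be compared programmatically. This is a small sketch that rebuilds the same df and stores each result; the object names explicit, combined and same are introduced here purely for illustration.

```r
library(dplyr)

set.seed(123)
df <- data.frame(group = rep(1:5, 2), x = runif(10), y = runif(10),
                 a1 = runif(10), a2 = runif(10), z = runif(10))

# Each column summarised by name
explicit <- df %>%
  group_by(group) %>%
  summarise(x = last(x), y = last(y),
            a1 = max(a1), a2 = max(a2),
            z = sum(z))

# Mutate first, then collapse each group to one row
combined <- df %>%
  group_by(group) %>%
  mutate_at(vars(x, y), last) %>%
  mutate_at(vars(starts_with("a")), max) %>%
  mutate(z = sum(z)) %>%
  summarise_all(max)

# TRUE when the two tibbles agree (up to numeric tolerance)
same <- isTRUE(all.equal(explicit, combined))
same
```

all.equal() compares numeric columns with a small tolerance, which is usually what you want for floating-point sums.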