I have data of train trips and the number of delayed or cancelled trains that I would like to make the sum.
Start End Delayed Cancelled
Paris Rome 1 0
Brussels Berlin 4 6
Berlin Brussels 6 2
Rome Paris 2 1
How can I group the Start and End columns to make the sum of Paris-Rome and Rome-Paris and Brussels-Berlin and Berlin Brussels to make the total of delayed and cancelled trips by train line ?
Order alphabetically and summarise by group:
df |>
group_by(route = if_else(Start < End, paste(Start, End, sep = "-"), paste(End, Start, sep = "-"))) |>
summarise(Delayed = sum(Delayed), Cancelled = sum(Cancelled))
# route Delayed Cancelled
# <chr> <int> <int>
# 1 Berlin-Brussels 10 8
# 2 Paris-Rome 3 1
Reproducible data:
df = data.frame(
Start = c("Paris", "Brussels", "Berlin", "Rome"),
End = c("Rome", "Berlin", "Brussels", "Paris"),
Delayed = c(1L, 4L, 6L, 2L),
Cancelled = c(0L, 6L, 2L, 1L)
)