I have a flight data set with four columns, as shown below.
Original:
I want an output with one row per unique combination of three columns (Origin/Destination/Airline) that sums the number of passengers and counts the number of rows for each combination. The result would look something like this.
Output:
I am able to do one part of it using the group_by function:
df %>% group_by(Origin, destination, carrier) %>% summarise(count = n())
How can I also include the sum of passengers?
We can use dplyr and compute both the row count and the passenger total in a single summarise() call:
library(dplyr)
df1 %>%
  group_by(Origin, Destination, Airline) %>%
  dplyr::summarise(count = n(), TotalPassengers = sum(Passengers))
# Groups:   Origin, Destination [2]
#   Origin Destination Airline count TotalPassengers
#   <chr>  <chr>       <chr>   <int>           <dbl>
#1  ABE    ATL         9A          2               3
#2  ABE    ATL         DL          1               5
#3  NYC    SFA         AA          3              21
#4  NYC    SFA         DL          1               5
# Example data used above
df1 <- data.frame(Origin = rep(c("ABE", "NYC"), c(3, 4)),
                  Destination = rep(c("ATL", "SFA"), c(3, 4)),
                  Airline = c("9A", "9A", "DL", "AA", "AA", "AA", "DL"),
                  Passengers = c(2, 1, 5, 4, 10, 7, 5))
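The "Groups: Origin, Destination [2]" header in the output means the result is still grouped by the first two columns, which can affect later steps. As a minimal sketch, assuming dplyr 1.0.0 or later (where summarise() accepts a .groups argument), the same summary can be returned ungrouped. The na.rm = TRUE argument is an added assumption that missing passenger counts should be ignored; the example data contains none, and the name res is only illustrative.

```r
library(dplyr)

# Same example data as in the answer
df1 <- data.frame(Origin = rep(c("ABE", "NYC"), c(3, 4)),
                  Destination = rep(c("ATL", "SFA"), c(3, 4)),
                  Airline = c("9A", "9A", "DL", "AA", "AA", "AA", "DL"),
                  Passengers = c(2, 1, 5, 4, 10, 7, 5))

res <- df1 %>%
  group_by(Origin, Destination, Airline) %>%
  summarise(count = n(),                                      # rows per combination
            TotalPassengers = sum(Passengers, na.rm = TRUE),  # assumption: skip NAs
            .groups = "drop")                                 # return an ungrouped tibble

res
```

If the grouping is wanted for further per-route calculations, leaving out .groups keeps the default behaviour shown in the answer.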