Tags: r, dplyr, double, mean, tibble

How to take the mean of a vector of doubles


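library(dplyr)         # provides %>%, select(), group_by(), filter()
library(nycflights13)
flights <- nycflights13::flights

# Keep the two relevant columns, group by month, and drop missing delays
flights %>% select(arr_delay, month) %>% group_by(month) %>% filter(!is.na(arr_delay))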

My goal is to get the mean arrival delay for each month, but every time I try to take the mean, I get an error.


Solution

  • mean() has an na.rm argument, so there is no need to filter out the missing values first. Instead, call mean() with na.rm = TRUE inside summarise():

    library(dplyr)
    flights %>% 
        select(arr_delay, month) %>%
        group_by(month) %>% 
        summarise(Mean = mean(arr_delay, na.rm = TRUE))
    

    Output:

    # A tibble: 12 × 2
       month   Mean
       <int>  <dbl>
     1     1  6.13 
     2     2  5.61 
     3     3  5.81 
     4     4 11.2  
     5     5  3.52 
     6     6 16.5  
     7     7 16.7  
     8     8  6.04 
     9     9 -4.02 
    10    10 -0.167
    11    11  0.461
    12    12 14.9
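
To see what na.rm does in isolation, here is a minimal base-R sketch. The values in delays are made up for illustration and are not taken from nycflights13. The question does not show the actual error, so this only illustrates how mean() treats missing values, not the original failure.

```r
# Hypothetical delay values (not from nycflights13), with one missing entry
delays <- c(10, NA, -4, 6)

# Any NA in the input makes the default mean NA
mean_default <- mean(delays)

# na.rm = TRUE drops the NA before averaging: (10 - 4 + 6) / 3
mean_no_na <- mean(delays, na.rm = TRUE)

# Equivalent to removing the missing values by hand first
mean_filtered <- mean(delays[!is.na(delays)])

print(mean_default)   # NA
print(mean_no_na)     # 4
print(mean_filtered)  # 4
```

Within summarise(), the same logic is applied once per group, which is why the separate filter(!is.na(arr_delay)) step is not needed for the per-month means.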