
Persistent error with the dplyr summarize function


I am trying to calculate the mean, median, min, and max of every variable, grouped by Site, using the summarize function. In my code I replace NA with 0, but I am also open to using na.rm = TRUE instead if it is easy to incorporate.

I keep getting the following error message and cannot figure it out...

Error: Problem with `summarise()` input `..2`.
i `..2 = list(mean, median, min, max)`.
x `..2` must be size 6 or 1, not 4.
i An earlier column had size 6.
i The error occurred in group 1: Site = 1.

Below is my data and code:

Dataset Reprex

data = structure(list(Site = c(7, 1, 7, 7, 1, 1, 7, 1, 6, 1, 1), OS_days = c(264, 
208, 184, 145, 131, 116, 82, 74, 76, 82, 68), ster_days = c(241, 
135, 184, NA, 85, 106, NA, NA, NA, NA, 69), pct_ster = c(0.912878787878788, 
0.649038461538462, 1, NA, 0.648854961832061, 0.913793103448276, 
NA, NA, NA, NA, 1.01470588235294), first_ster_days = c(28, 72, 
1, NA, 42, 1, NA, NA, NA, NA, 1), tot_bev_days = c(1, 13, NA, 
NA, NA, 75, NA, NA, NA, NA, NA), pct_bev = c(0.00378787878787879, 
0.0625, NA, NA, NA, 0.646551724137931, NA, NA, NA, NA, NA), first_bev_days = c(48, 
86, NA, NA, NA, 22, NA, NA, NA, NA, NA), SPD = structure(c(1219.86, 
1107, 1508, 442.74, 524.61, 1733.76, 2079.77, 443.44, NA, 601.8, 
1621.3), label = "Measurement Number 1 mm")), row.names = c(NA, 
-11L), class = c("tbl_df", "tbl", "data.frame"))
knitr::kable(data, digits = 3)


| Site| OS_days| ster_days| pct_ster| first_ster_days| tot_bev_days| pct_bev| first_bev_days|     SPD|
|----:|-------:|---------:|--------:|---------------:|------------:|-------:|--------------:|-------:|
|    7|     264|       241|    0.913|              28|            1|   0.004|             48| 1219.86|
|    1|     208|       135|    0.649|              72|           13|   0.062|             86| 1107.00|
|    7|     184|       184|    1.000|               1|           NA|      NA|             NA| 1508.00|
|    7|     145|        NA|       NA|              NA|           NA|      NA|             NA|  442.74|
|    1|     131|        85|    0.649|              42|           NA|      NA|             NA|  524.61|
|    1|     116|       106|    0.914|               1|           75|   0.647|             22| 1733.76|
|    7|      82|        NA|       NA|              NA|           NA|      NA|             NA| 2079.77|
|    1|      74|        NA|       NA|              NA|           NA|      NA|             NA|  443.44|
|    6|      76|        NA|       NA|              NA|           NA|      NA|             NA|      NA|
|    1|      82|        NA|       NA|              NA|           NA|      NA|             NA|  601.80|
|    1|      68|        69|    1.015|               1|           NA|      NA|             NA| 1621.30|

Code

data %>%
  replace(is.na(.), 0) %>%
  group_by(Site) %>%
  dplyr::summarise(across(c(OS_days, ster_days, pct_ster, first_ster_days, tot_bev_days, pct_bev, first_bev_days, SPD)), list(mean, median, min, max)) 

Solution

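  • The closing parenthesis of across() came too early, so list(mean, median, min, max) was passed to summarise() as a separate second argument (`..2`) instead of as the functions for across(). Without functions, across() returned the columns unchanged (6 rows for Site = 1), and the length-4 list could not be recycled to that size. Moving the parenthesis so that the list sits inside across() fixes it: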

    library(dplyr)
    data %>%
      replace(is.na(.), 0) %>% 
      group_by(Site) %>%
      dplyr::summarise(across(c(OS_days, ster_days, pct_ster, 
          first_ster_days, tot_bev_days, pct_bev, first_bev_days, SPD), 
            list(mean, median, min, max)))
    

    Output:

    # A tibble: 3 x 33
       Site OS_days_1 OS_days_2 OS_days_3 OS_days_4 ster_days_1 ster_days_2 ster_days_3 ster_days_4 pct_ster_1 pct_ster_2 pct_ster_3 pct_ster_4 first_ster_days_1
      <dbl>     <dbl>     <dbl>     <dbl>     <dbl>       <dbl>       <dbl>       <dbl>       <dbl>      <dbl>      <dbl>      <dbl>      <dbl>             <dbl>
    1     1      113.       99         68       208        65.8          77           0         135      0.538      0.649          0       1.01             19.3 
    2     6       76        76         76        76         0             0           0           0      0          0              0       0                 0   
    3     7      169.      164.        82       264       106.           92           0         241      0.478      0.456          0       1                 7.25
    # … with 19 more variables: first_ster_days_2 <dbl>, first_ster_days_3 <dbl>, first_ster_days_4 <dbl>, tot_bev_days_1 <dbl>, tot_bev_days_2 <dbl>,
    #   tot_bev_days_3 <dbl>, tot_bev_days_4 <dbl>, pct_bev_1 <dbl>, pct_bev_2 <dbl>, pct_bev_3 <dbl>, pct_bev_4 <dbl>, first_bev_days_1 <dbl>,
    #   first_bev_days_2 <dbl>, first_bev_days_3 <dbl>, first_bev_days_4 <dbl>, SPD_1 <dbl>, SPD_2 <dbl>, SPD_3 <dbl>, SPD_4 <dbl>
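
The question also asks about na.rm = TRUE. Replacing NA with 0 changes the results (in the output above, the minimum is 0 wherever a value was missing, and the means are pulled down), so skipping the missing values may be closer to what is wanted. The following is a minimal sketch of one way to do that, not the only one: it uses a small made-up data frame `df` (not the data from the question) and a named list of lambdas, here called `stats`, so that the output columns get readable suffixes instead of _1 to _4.

```r
library(dplyr)

# Small illustrative data frame (made up for this sketch, not the asker's data)
df <- tibble(
  Site    = c(1, 1, 1, 2, 2),
  OS_days = c(10, 20, NA, 5, 7),
  SPD     = c(100, NA, 300, 50, NA)
)

# Named list of lambdas: the names become the column suffixes,
# and na.rm = TRUE drops missing values instead of replacing them with 0
stats <- list(
  mean   = ~ mean(.x, na.rm = TRUE),
  median = ~ median(.x, na.rm = TRUE),
  min    = ~ min(.x, na.rm = TRUE),
  max    = ~ max(.x, na.rm = TRUE)
)

res <- df %>%
  group_by(Site) %>%
  summarise(across(c(OS_days, SPD), stats), .groups = "drop")

# Columns are named <column>_<function>, e.g. OS_days_mean, SPD_max
print(res)
```

One caveat with this approach: if a group has only NA in a column (as Site 6 does for most columns in the question's data), mean() returns NaN, median() returns NA, and min() and max() return Inf and -Inf with a warning, so those groups may need separate handling.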