
Reworking old code that uses deprecated funs(): cannot get n() to work


I have some older code that I am trying to rework since funs() has been deprecated (I know, I'm way behind!). I often rely on the output that this style of summarise_if() gives, but I cannot get it to work with list().
Older Code:

iris_means<-iris %>% 
      group_by(Species) %>% 
      summarise_if(is.numeric,funs(N=n(),mean,sd, se=sd(.)/sqrt(n()))) %>% 
      ungroup()

I tried this because I thought I was getting the error due to another package masking n(), but apparently I am doing something else wrong, as I still get the same error: Error in n(): ! Must only be used inside data-masking verbs like mutate(), filter(), and group_by().

iris_means<-iris %>% 
  group_by(Species) %>% 
  dplyr::summarise_if(is.numeric,list(N=n(),mean,sd, se=sd(.)/sqrt(n()))) %>% 
  ungroup()

How can I update this code, before funs() is removed entirely, so that it works correctly and gives the same column names as before?


Solution

  • Using across() and where(), with every expression that calls n() or refers to . wrapped in a ~ lambda, you could rewrite your code like so:

    library(dplyr)
    
    iris %>%
      group_by(Species) %>%
      summarise(across(
        where(is.numeric),
        list(N = ~ n(), mean = mean, sd = sd, se = ~ sd(.) / sqrt(n()))
      )) %>%
      ungroup()
    #> # A tibble: 3 × 17
    #>   Species    Sepal.Len…¹ Sepal…² Sepal…³ Sepal…⁴ Sepal…⁵ Sepal…⁶ Sepal…⁷ Sepal…⁸
    #>   <fct>            <int>   <dbl>   <dbl>   <dbl>   <int>   <dbl>   <dbl>   <dbl>
    #> 1 setosa              50    5.01   0.352  0.0498      50    3.43   0.379  0.0536
    #> 2 versicolor          50    5.94   0.516  0.0730      50    2.77   0.314  0.0444
    #> 3 virginica           50    6.59   0.636  0.0899      50    2.97   0.322  0.0456
    #> # … with 8 more variables: Petal.Length_N <int>, Petal.Length_mean <dbl>,
    #> #   Petal.Length_sd <dbl>, Petal.Length_se <dbl>, Petal.Width_N <int>,
    #> #   Petal.Width_mean <dbl>, Petal.Width_sd <dbl>, Petal.Width_se <dbl>, and
    #> #   abbreviated variable names ¹​Sepal.Length_N, ²​Sepal.Length_mean,
    #> #   ³​Sepal.Length_sd, ⁴​Sepal.Length_se, ⁵​Sepal.Width_N, ⁶​Sepal.Width_mean,
    #> #   ⁷​Sepal.Width_sd, ⁸​Sepal.Width_se
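
    The column names in that output come from across()'s naming pattern. As a minimal sketch, the same call can spell the pattern out through the .names argument; "{.col}_{.fn}" is the documented default when a named list of functions is supplied, so this should return the same columns as the call above. The object name iris_means is borrowed from the question.

    ```r
    library(dplyr)

    # Same summary as above, with the naming pattern written out explicitly.
    # "{.col}_{.fn}" pastes the column name and the list name together,
    # e.g. Sepal.Length + N -> Sepal.Length_N.
    iris_means <- iris %>%
      group_by(Species) %>%
      summarise(across(
        where(is.numeric),
        list(N = ~ n(), mean = mean, sd = sd, se = ~ sd(.) / sqrt(n())),
        .names = "{.col}_{.fn}"
      )) %>%
      ungroup()

    # Inspect the resulting column names
    names(iris_means)
    ```

    Changing the glue string (for instance to "{.fn}_{.col}") is one way to get a different naming scheme without renaming columns afterwards.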
    

    With dplyr >= 1.1.0 you can drop group_by() and ungroup() by passing the .by argument to summarise() instead (thanks to @Edo for the suggestion):

    iris %>%
      summarise(
        across(
          where(is.numeric),
          list(N = ~ n(), mean = mean, sd = sd, se = ~ sd(.) / sqrt(n()))
        ),
        .by = Species
      )
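
As for why the list() attempt in the question errors: as far as I can tell, list(N = n(), ...) is an ordinary R call, so n() runs as soon as the list is built, before summarise_if() has set up any grouped data for it to look at. A formula such as ~ n() is only stored and gets called later, once per column and group. The sketch below illustrates the difference; the names eager and deferred are made up for this example.

```r
library(dplyr)

# Building the list calls n() right away, outside any dplyr verb,
# so the error is raised before any summarising starts.
eager <- tryCatch(list(N = n()), error = function(e) e)
inherits(eager, "error")

# A formula is only stored here; n() is not called at this point.
# across() later turns it into a function and calls it inside the verb.
deferred <- list(N = ~ n())
class(deferred$N)
```

Bare function names such as mean and sd need no ~ because they are passed as functions rather than called when the list is created.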