I am trying to generate year-wise summary statistics as follows:
data %>%
group_by(year) %>%
summarise(mean.abc = mean(abc), mean.def = mean(def), sd.abc = sd(abc), sd.def = sd(def))
This code returns a single row with NA in every column, instead of one row per year:
mean.abc mean.def sd.abc sd.def
1 NA NA NA NA
So I tried to work this out by replicating an example with a built-in dataset:
data(mtcars)
mtcars %>%
group_by(cyl) %>%
summarise(mean = mean(disp))
And this script returns a single overall mean rather than one mean per group:
mean
1 230.7219
So, what am I doing wrong? I am loading the following packages:
loadpackage( c("foreign","haven", "tidyverse", "plyr", "stringr", "eeptools", "factoextra") )
Thanks for your support!
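Your issue is that the summarise function from the plyr package does not do what you expect it to do. Because plyr is loaded after tidyverse in your package list, plyr's summarise masks the dplyr version, and plyr's summarise ignores the grouping set by group_by.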
See the difference between:
library(tidyverse)
mtcars %>%
group_by(cyl) %>%
plyr::summarise(mean = mean(disp))
#> mean
#> 1 230.7219
and
mtcars %>%
group_by(cyl) %>%
dplyr::summarise(mean = mean(disp))
#> # A tibble: 3 x 2
#> cyl mean
#> <dbl> <dbl>
#> 1 4 105.
#> 2 6 183.
#> 3 8 353.
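To check which summarise a bare call resolves to in your own session, something along these lines should work. It is a minimal sketch that assumes dplyr and plyr are both installed, and it attaches them in the same order as in the question:

```r
library(dplyr)
library(plyr)  # attached after dplyr, so plyr's summarise masks dplyr's

# Name of the package whose summarise a bare call resolves to
masking_pkg <- environmentName(environment(summarise))
masking_pkg
#> [1] "plyr"

# An explicit namespace prefix is unaffected by the attach order
by_cyl <- mtcars %>%
  dplyr::group_by(cyl) %>%
  dplyr::summarise(mean = mean(disp))
nrow(by_cyl)
#> [1] 3
```

If you need both packages, attaching plyr before dplyr (or tidyverse) is the other common way around this; plyr's own startup message suggests that order.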
Since your data seems to have missing values (mean and sd return NA for a column containing an NA unless na.rm = TRUE is set), this should do the trick:
data %>%
  group_by(year) %>%
  # call dplyr's summarise explicitly so plyr cannot mask it
  dplyr::summarise(across(all_of(c("abc", "def")),
                          .fns = list(mean = ~ mean(.x, na.rm = TRUE),
                                      sd = ~ sd(.x, na.rm = TRUE))))
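The effect of na.rm can be seen on a small vector using base R only. The vector x below is made up for illustration and is not taken from your data:

```r
# Hypothetical values with one missing observation
x <- c(1, 2, NA, 4)

# Without na.rm, a single NA makes the result NA
m_default <- mean(x)
m_default
#> [1] NA

# With na.rm = TRUE, the NA is dropped before computing
m_clean <- mean(x, na.rm = TRUE)
m_clean
#> [1] 2.333333

s_clean <- sd(x, na.rm = TRUE)
s_clean
#> [1] 1.527525
```

With the across call above, the result columns should be named abc_mean, abc_sd, def_mean and def_sd, which is the default naming across uses when it is given a named list of functions.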