I am trying to summarize all columns in my dataset (which has many columns, although the rep example bellow only has 2), getting mean and sd for each variable. I want the output to be in long format.
#Example dataset
d <- iris %>% select(Sepal.Length,Sepal.Width)
names(d) <- c("SepalLength","SepalWidth")
#Summarizing and trying to make it long
s <- d %>% summarize_each( list(mean=mean,sd=sd) ) # has summar stats, but they are in wide format
# trying to pivot.
s %>% pivot_longer( ??? what do I put here ???)
I tried a few variations (ex: pivot_longer(names_to = "key", values_to = "value")
) but always get an error message.
We can use select_helpers
inside
library(dplyr)
library(tidyr)
s %>%
pivot_longer(everything())
# A tibble: 4 x 2
# name value
# <chr> <dbl>
#1 SepalLength_mean 5.84
#2 SepalWidth_mean 3.06
#3 SepalLength_sd 0.828
#4 SepalWidth_sd 0.436
Or if we need 'SepalLength', 'SepalWidth' as two columns
s %>%
pivot_longer(cols = everything(),
names_to = c(".value", "statistic"), names_sep="_")
# A tibble: 2 x 3
# statistic SepalLength SepalWidth
# <chr> <dbl> <dbl>
#1 mean 5.84 3.06
#2 sd 0.828 0.436
Or if we need the 'mean', 'sd' as two columns
s %>%
pivot_longer(cols = everything(),
names_to = c("colNames", ".value"), names_sep="_")
# A tibble: 2 x 3
# colNames mean sd
# <chr> <dbl> <dbl>
#1 SepalLength 5.84 0.828
#2 SepalWidth 3.06 0.436
Or using gather
s %>%
gather